docs/data.html

<!--#include virtual="header.html" -->

<style type="text/css">
  code { white-space: pre; border: 1px dashed black; display: block; }
  pre  { white-space: pre; border: 1px dashed black; }
</style>

<h2>dygraphs Data Format</h2>

<p>When you create a Dygraph object, your code looks something like
this:</p>

<code>
  g = new Dygraph(document.getElementById("div"),
                  <i>data</i>,
                  { <i>options</i> });

</code>

<p>This document is about what you can put in the <i>data</i>
parameter.</p>

<p>There are five types of input that dygraphs will accept:</p>
<ol>
  <li><a href="#csv">CSV data</a>
  <li><a href="#url">URL</a>
  <li><a href="#array">array (native format)</a>
  <li><a href="#function">function</a>
  <li><a href="#datatable">DataTable</a>
</ol>

<p>These are all discussed below. If you're trying to debug why your input
won't parse, <b>check the JS error console</b>. dygraphs tries to log
informative errors explaining what's wrong with your data, and these can
often point you in the right direction.</p>

<p>There are several options which affect how your input data is
interpreted. These are:</p>
<ul>
  <li> <i>xValueParser</i> affects CSV only.
  <li> <i>errorBars</i> affects all input types.
  <li> <i>customBars</i> affects all input types.
  <li> <i>fractions</i> affects all input types.
  <li> <i>labels</i> affects all input types.
</ul>

<a name="csv"></a>
  <h3>CSV</h3>
<p>Here's an example of what CSV data should look like:</p>
<pre>
Date,Series1,Series2
2009/07/12,100,200  # comments are OK on data lines
2009/07/19,150,201
</pre>

<p>"CSV" is actually a bit of a misnomer: the data can be tab-delimited,
too. The delimiter is set by the <i>delimiter</i> option. It defaults to ",".
If no delimiter is found in the first row, it switches over to tab.</p>
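<p>The fallback rule above can be sketched as follows. This is only an
illustration: <i>pickDelimiter</i> is a hypothetical helper, not a dygraphs
function, and the library's real detection code may differ.</p>

```javascript
// Hypothetical helper illustrating the delimiter fallback described above.
function pickDelimiter(csv, delimiter) {
  // Use "," when no delimiter option is given.
  delimiter = delimiter || ",";
  var firstRow = csv.split("\n")[0];
  // If the delimiter never appears in the first row, switch over to tab.
  return firstRow.indexOf(delimiter) === -1 ? "\t" : delimiter;
}
```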

<p>CSV parsing can be split into three parts: headers, x-value and
y-values.</p>

<h4>Headers</h4>
<p>If you don't specify the <i>labels</i> option, dygraphs will look at the
first line of your CSV data to get the labels. If you see numbers for series
labels when you hover over the dygraph, it's likely because your first line
contains data but is being parsed as a label. The solution is to either add
a header line or specify the labels like this:</p>

<code>
  new Dygraph(el,
              "2009/07/12,100,200\n" +
              "2009/07/19,150,201\n",
              { labels: [ "Date", "Series1", "Series2" ] });
</code>

<h4>x-values</h4>
<p>Once the headers are parsed, dygraphs needs to determine what the type of
the x values is. They're either dates or numbers. To make this
determination, it looks at the first column of the first row ("2009/07/12"
in the example above). Here's the heuristic: if it contains a '-' or a '/',
or otherwise doesn't parse as a float, then it's a date. Otherwise, it's a
number.</p>
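<p>A rough sketch of that heuristic is shown below. <i>guessXType</i> is a
hypothetical name, and the sketch assumes that a leading '-' is a minus sign
rather than a date separator; the actual dygraphs check may handle more edge
cases.</p>

```javascript
// Hypothetical helper: classify the first x value as "date" or "number".
function guessXType(firstX) {
  var hasDash = firstX.indexOf("-") > 0;   // a '-' after the first character
  var hasSlash = firstX.indexOf("/") >= 0;
  var notAFloat = isNaN(parseFloat(firstX));
  return (hasDash || hasSlash || notAFloat) ? "date" : "number";
}
```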

<p>Once the type is determined, that doesn't mean all the values will parse
correctly. The general rule is:</p>

<ul>
  <li>For dates, your strings have to be parseable by <i>Date.parse</i>.
  <li>For numbers, your strings have to be parseable by <i>parseFloat</i>.
</ul>

<p>You can manually verify this using a JavaScript console. If a value
doesn't parse, dygraphs will put a warning about it on your console. But
beware: different browsers support different date formats!</p>

<p>Here are some valid date formats:</p>
<ul>
  <li>2009-07-12</li>
  <li>2009/07/12</li>
  <li>2009/07/12 12</li>
  <li>2009/07/12 12:34</li>
  <li>2009/07/12 12:34:56</li>
</ul>
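<p>To check a batch of values by hand, something like the following can be
pasted into a JavaScript console. <i>findUnparseable</i> is a hypothetical
helper written for this page; results for date strings depend on the
JavaScript engine you run it in.</p>

```javascript
// Hypothetical helper: return the strings that fail to parse as the given type.
function findUnparseable(values, type) {
  return values.filter(function (s) {
    var parsed = (type === "date") ? Date.parse(s) : parseFloat(s);
    return isNaN(parsed);
  });
}

// Example: findUnparseable(["2009-07-12", "not a date"], "date")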

<p>If you specify the <i>xValueParser</i> option, then all this detection is
bypassed and your function is called instead. Your parser function takes in
a string and needs to return a number. For dates/times, you should return
milliseconds since epoch. You may also want to specify a few other options
to make sure that everything gets displayed properly.</p>

<p>Here's code which parses a CSV file with unix timestamps in the first
column:</p>

<code>
  new Dygraph(el,
              "Date,Series1,Series2\n" +
              "1247382000,100,200\n" +
              "1247986800,150,201\n",
              {
                // Convert unix seconds to milliseconds since epoch.
                xValueParser: function(x) { return 1000 * parseInt(x, 10); },
                axes : {
                  x : {
                    valueFormatter: Dygraph.dateString_,
                    ticker: Dygraph.dateTicker
                  }
                }
              });
</code>
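<p>The parser itself is easy to test in isolation. The sketch below pulls it
out into a standalone function (the name <i>parseUnixSeconds</i> is made up
for this example) so you can confirm it returns milliseconds since epoch.</p>

```javascript
// Standalone version of the parser: unix seconds (string) -> milliseconds.
function parseUnixSeconds(str) {
  return 1000 * parseInt(str, 10);
}

// The first timestamp in the CSV above, as a Date.
var firstDate = new Date(parseUnixSeconds("1247382000"));
```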

<h4>y-values</h4>
<p>Dependent (y-axis) values are simpler than x-values because they're
always numbers. The complexity here comes from the various ways that you can
specify the uncertainty in your measurements.</p>

<p>If your y-values are just numbers, then they need to be parseable by
JavaScript's parseFloat function. Acceptable formats include:</p>

<ul>
  <li>12
  <li>-12
  <li>12.
  <li>12.3
  <li>1.24e+1
  <li>-1.24e+1
</ul>

<p>If you have missing data, just leave the column blank (your CSV file will
probably contain a ",," in it).</p>
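<p>As an illustration of how a blank column might be read, the sketch below
splits one CSV line and maps empty fields to null. Both the helper name
<i>parseYValues</i> and the choice of null for a blank field are assumptions
made for this example, not a description of dygraphs internals.</p>

```javascript
// Hypothetical helper: parse the y columns of one comma-delimited line.
function parseYValues(line) {
  return line.split(",").slice(1).map(function (field) {
    // A blank field (as in ",,") is treated as a missing value.
    return field.trim() === "" ? null : parseFloat(field);
  });
}
```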

<p>If your numbers have uncertainty associated with them, then there are
three basic ways to express this: using fractions, standard deviations or
explicit ranges.</p>

<h5>Fractions</h5>
<p>If you specify the <i>fractions</i> option, then your data will all be
interpreted as ratios between zero and one. This is often the case if you're
plotting a percentage.</p>

<code>
  new Dygraph(el,
              "X,Frac1,Frac2\n" +
              "1,1/2,3/4\n"+
              "2,1/3,2/3\n"+
              "3,2/3,17/49\n"+
              "4,25/30,100/200",
              { fractions: true });
</code>

<p>Why not just divide the fractions out yourself? There are two attractive
reasons not to:</p>

<ul>
  <li>If you set both <i>fractions</i> and <i>errorBars</i>, then the
denominator is interpreted as a sample size and dygraphs will plot <a
  href="http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval">Wilson
  binomial proportion confidence intervals</a> around each point.

  <li>If you set <i>showRoller</i>, then dygraphs will combine the values as
  fractions. If two points are <i>a/b</i> and <i>c/d</i>, it will plot
  <i>(a+c) / (b+d)</i> rather than <i>(a/b + c/d) / 2</i>, which is what
  you'd get if you divided the fractions through. This will also shrink the
  confidence intervals.</li>
</ul>
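<p>To see why combining fractions differs from averaging ratios, here is a
small worked sketch. The function names are hypothetical, and each point is
assumed to be a [numerator, denominator] pair.</p>

```javascript
// Combine points as fractions: sum the numerators and the denominators.
function rollFractions(points) {
  var num = 0, den = 0;
  points.forEach(function (p) { num += p[0]; den += p[1]; });
  return num / den;
}

// What you would get after dividing each fraction through first.
function averageOfRatios(points) {
  var sum = 0;
  points.forEach(function (p) { sum += p[0] / p[1]; });
  return sum / points.length;
}

// For 1/2 and 3/4: combined is 4/6, the average of ratios is 0.625.
var combined = rollFractions([[1, 2], [3, 4]]);
var averaged = averageOfRatios([[1, 2], [3, 4]]);
```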

<h5>Standard Deviations</h5>
<p>Often you have a measurement and also a measure of its uncertainty: a
standard deviation. If you specify the <i>errorBars</i> option, dygraphs
will look for alternating value and standard deviation columns in your CSV
data.  Here's what it should look like:</p>

<code>
  new Dygraph(el,
              "X,Y1,Y2\n" +
              "1,10,5,20,5\n" +
              "2,12,5,22,5\n",
              { errorBars: true });
</code>

<p>The "5" values are standard deviations. When each point is plotted, a
2-standard deviation region around it is shaded, resulting in a 95%
confidence interval. If you want more or less confidence, you can set the
<i>sigma</i> option to something other than 2.0.</p>
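<p>The shaded region for a single point can be computed as shown in this
sketch. <i>errorBand</i> is a hypothetical helper; it simply applies
value ± sigma × std with sigma defaulting to 2.0.</p>

```javascript
// Hypothetical helper: lower and upper edge of the shaded region.
function errorBand(value, std, sigma) {
  if (sigma === undefined) sigma = 2.0;
  return [value - sigma * std, value + sigma * std];
}

// First Y1 point from the CSV above: value 10, standard deviation 5.
var band = errorBand(10, 5);
```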

<p>When you roll data with standard deviations, dygraphs will plot the
average of your values in each rolling period and the RMS value of your
standard deviations: sqrt(std1^2 + std2^2 + std3^2 + ... + stdN^2)/N.</p>
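<p>A minimal sketch of rolling one period, assuming the combined standard
deviation is the square root of the sum of squared deviations divided by N.
The helper name <i>rollWithStd</i> is hypothetical, and each point is a
[value, std] pair.</p>

```javascript
// Hypothetical helper: average the values and combine the deviations.
function rollWithStd(points) {
  var n = points.length, sum = 0, squares = 0;
  points.forEach(function (p) {
    sum += p[0];
    squares += p[1] * p[1];
  });
  return [sum / n, Math.sqrt(squares) / n];
}

// Values 10 and 12 with deviations 3 and 4: sqrt(9 + 16) / 2 = 2.5.
var rolled = rollWithStd([[10, 3], [12, 4]]);
```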

<h5>Custom error bars</h5>
<p>Sometimes your data has asymmetric uncertainty or you want to specify
something else with the error bars around a point. One example of this is
the "temperatures" demo on the <a href="http://danvk.org/dygraphs">dygraphs
  home page</a>, where the point is the daily average and the bars denote
the low and high temperatures for the day.</p>

<p>To specify this format, set the <i>customBars</i> option. Your CSV values
should each be three numbers separated by semicolons ("low;mid;high").
Here's an example:</p>

<code>
  new Dygraph(el,
              "X,Y1,Y2\n" +
              "1,10;20;30,20;5;25\n" +
              "2,10;25;35,20;10;25\n",
              { customBars: true });
</code>

<p>The middle value need not lie between the low and high values. If you set
a rolling period, the three values will all be averaged independently.</p>
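<p>The following sketch shows one way to read a "low;mid;high" cell and to
average the three values independently over a rolling period. Both helper
names are hypothetical.</p>

```javascript
// Hypothetical helper: "10;20;30" -> [10, 20, 30]
function parseCustomBar(cell) {
  return cell.split(";").map(function (s) { return parseFloat(s); });
}

// Hypothetical helper: average low, mid and high separately.
function rollCustomBars(points) {
  var sums = [0, 0, 0];
  points.forEach(function (p) {
    for (var i = 0; i < 3; i++) sums[i] += p[i];
  });
  return sums.map(function (s) { return s / points.length; });
}

// The two Y1 cells from the CSV above.
var rolledBars = rollCustomBars([parseCustomBar("10;20;30"),
                                 parseCustomBar("10;25;35")]);
```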


<a name="url"></a>
<h3>URL</h3>
<p>If you pass in a URL, dygraphs will issue an XMLHttpRequest for it and
attempt to parse the returned data as CSV.
</p>

<p><i>Common problems</i>. Make sure the URL is accessible and returns data
in text format (as opposed to a CSV file with an HTML header). You can see
what the response looks like by checking your JS console or by requesting
the URL yourself.</p>


<a name="array"></a>
<h3>Array (native format)</h3>
<p>If you'll be constructing your data set from a server-side program (or
from JavaScript) then you're better off producing an array than CSV data.
This saves the cost of parsing the CSV data and also avoids common parser
errors.</p>

<p>The downside is that it's harder to look at your data (you'll need to use
a JS debugger) and that the data format is a bit less clear for values with
uncertainties.</p>

<p>Here's an example of "native format":</p>

<code>
  new Dygraph(document.getElementById("graphdiv2"),
              [
                [1,10,100],
                [2,20,80],
                [3,50,60],
                [4,70,80]
              ],
              {
                labels: [ "x", "A", "B" ]
              });
</code>

<h4>Headers</h4>
<p>Headers for native format must be specified via the <i>labels</i>
option. There's no other way to set them.</p>

<h4>x-values</h4>
<p>If you want your x-values to be dates, you'll need to specify a Date
object in the first column. Otherwise, specify a number. Here's a sample
array with dates on the x-axis:</p>

<code>
  [
    [ new Date("2009/07/12"), 100, 200 ],
    [ new Date("2009/07/19"), 150, 220 ]
  ]
</code>
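<p>If your data starts out as a list of records, a small conversion step
produces the array above. The record shape (<i>date</i>, <i>a</i>, <i>b</i>)
and the helper <i>toNativeRows</i> are assumptions made for this sketch.</p>

```javascript
// Hypothetical helper: turn records into rows with a Date in column one.
function toNativeRows(records) {
  return records.map(function (r) {
    return [new Date(r.date), r.a, r.b];
  });
}

var rows = toNativeRows([
  { date: "2009/07/12", a: 100, b: 200 },
  { date: "2009/07/19", a: 150, b: 220 }
]);
```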

<h4>y-values</h4>
<p>You can specify <i>errorBars</i>, <i>fractions</i> or <i>customBars</i>
with the array format. If you specify any of these, the values become arrays
(rather than numbers). Here's what the format looks like for each one:</p>

<code>
  <i>errorBars</i>: [x, [value1, std1], [value2, std2], ...]
  <i>fractions</i>: [x, [num1, den1], [num2, den2], ...]
  <i>customBars</i>: [x, [low1, val1, high1], [low2, val2, high2], ...]
</code>
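<p>For instance, rows in the <i>errorBars</i> layout can be assembled from
parallel arrays of values and standard deviations. The helper
<i>withErrorBars</i> below is hypothetical and handles a single series.</p>

```javascript
// Hypothetical helper: build [x, [value, std]] rows for one series.
function withErrorBars(xs, values, stds) {
  return xs.map(function (x, i) {
    return [x, [values[i], stds[i]]];
  });
}

var errorBarRows = withErrorBars([1, 2], [10, 12], [5, 5]);
```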

<p>To specify missing data, set the value to null or NaN. You may not set a
value inside an array to null or NaN; use null or NaN in place of the entire
array. The only difference between the two shows up when the option
<a href="options.html#connectSeparatedPoints">connectSeparatedPoints</a> is
true. In that case, the gaps created by nulls are filled in, and gaps
created by NaNs are preserved.
</p>
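<p>The rule about where null and NaN may appear can be expressed as a quick
check. <i>isValidCell</i> is a hypothetical helper for validating your own
data before handing it to dygraphs; it is not part of the library.</p>

```javascript
// Hypothetical helper: is this y cell allowed in native format?
function isValidCell(cell) {
  // null or any number (including NaN) may stand in for the whole cell.
  if (cell === null || typeof cell === "number") return true;
  // Inside an array, every entry must be a real number.
  if (Array.isArray(cell)) {
    return cell.every(function (v) {
      return typeof v === "number" && !isNaN(v);
    });
  }
  return false;
}
```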

<a name="function"></a>
<h3>Functions</h3>

<p>You can specify a function that returns any of the other types. If
<i>x</i> is a valid piece of dygraphs input, then so is</p>

<code>
  function() { return x; }
</code>

<p>Functions can return strings, arrays, data tables, URLs, or any other
valid input type.</p>
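<p>Conceptually, function input is resolved by calling it and using the
result. The sketch below shows that idea with a hypothetical
<i>resolveData</i> helper; it is not the dygraphs implementation.</p>

```javascript
// Hypothetical helper: call function input, pass everything else through.
function resolveData(data) {
  return (typeof data === "function") ? data() : data;
}

var csv = "X,Y\n1,10\n2,20\n";
var fromFunction = resolveData(function () { return csv; });
var fromString = resolveData(csv);
```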

<a name="datatable"></a>
<h3>DataTable</h3>
<p>You can also specify a Google Visualization Library <a
  href="http://code.google.com/apis/visualization/documentation/reference.html#DataTable">DataTable</a>
object as your input data. This lets you easily switch between dygraphs and
other gviz visualizations such as the Annotated Timeline. It also lets you
embed a Dygraph in a Google Spreadsheet.</p>

<p>You'll need to set your first column's type to one of "number", "date"
or "datetime".</p>

<pre>
DataTable TODO:
- When to use Dygraph.GvizWrapper
- how to specify fractions
- how to specify missing data
- how to specify value + std. dev.
- how to specify [low, middle, high]
- walkthrough of embedding a gadget in google docs/on a web page
- walkthrough of using std. dev. in a spreadsheet chart
</pre>

<!--#include virtual="footer.html" -->