docs/data.html

   1 <html>
   2   <head>
   3     <title>dygraphs input types</title>
   4     <style type="text/css">
   5       code { white-space: pre; }
   6       pre  { white-space: pre; }
   7     </style>
   8   </head>
   9   <body>
  10     <h2>dygraphs Data Format</h2>
  11
  12     <p>When you create a Dygraph object, your code looks something like
  13     this:</p>
  14
  15     <code>
  16       g = new Dygraph(document.getElementById("div"),
  17                       <i>data</i>,
  18                       { <i>options</i> });
  19     </code>
  20
  21     <p>This document is about what you can put in the <i>data</i>
  22     parameter.</p>
  23
  24     <p>There are five types of input that dygraphs will accept:</p>
  25     <ol>
  26       <li>CSV data
  27       <li>URL
  28       <li>array (native format)
  29       <li>function
  30       <li>DataTable
  31     </ol>
  32
  33     <p>These are all discussed below. If you're trying to debug why your input
  34     won't parse, <b>check the JS error console</b>. dygraphs tries to log
  35     informative errors explaining what's wrong with your data, and these can
  36     often point you in the right direction.</p>
  37
  38     <p>There are several options that affect how your input data is
  39     interpreted. These are:</p>
  40     <ul>
  41       <li> <i>xValueParser</i> affects CSV only.
  42       <li> <i>errorBars</i> affects all input types.
  43       <li> <i>customBars</i> affects all input types.
  44       <li> <i>fractions</i> affects all input types.
  45       <li> <i>labels</i> affects all input types.
  46     </ul>
  47
  48
  49     <h3>CSV</h3>
  50     <p>Here's an example of what CSV data should look like:</p>
  51     <pre>
  52     Date,Series1,Series2
  53     2009/07/12,100,200  # comments are OK on data lines
  54     2009/07/19,150,201
  55     </pre>
  56
  57     <p>"CSV" is actually a bit of a misnomer: the data can be tab-delimited,
  58     too. The delimiter is set by the <i>delimiter</i> option, which defaults to
  59     ",". If no delimiter is found in the first row, dygraphs switches over to tab.</p>
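<p>The fallback rule can be sketched in plain JavaScript. This is only an
illustration of the behavior described above, not dygraphs' own code:
<i>pickDelimiter</i> is a hypothetical helper name.</p>

```javascript
// Hypothetical helper (not part of the dygraphs API): choose the field
// delimiter for CSV text using the fallback rule described above.
function pickDelimiter(csvText, delimiter) {
  // The delimiter option is described as defaulting to ",".
  const wanted = delimiter === undefined ? "," : delimiter;
  const firstLine = csvText.split("\n")[0];
  // If the requested delimiter never appears in the first row, assume tabs.
  return firstLine.includes(wanted) ? wanted : "\t";
}

const commaCsv = "Date,Series1,Series2\n2009/07/12,100,200\n";
const tabCsv = "Date\tSeries1\tSeries2\n2009/07/12\t100\t200\n";
const commaDelimiter = pickDelimiter(commaCsv);
const tabDelimiter = pickDelimiter(tabCsv);
```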
  60
  61     <p>CSV parsing can be split into three parts: headers, x-value and
  62     y-values.</p>
  63
  64     <h4>Headers</h4>
  65     <p>If you don't specify the <i>labels</i> option, dygraphs will look at the
  66     first line of your CSV data to get the labels. If you see numbers for series
  67     labels when you hover over the dygraph, it's likely because your first line
  68     contains data but is being parsed as a label. The solution is to either add
  69     a header line or specify the labels like this:</p>
  70
  71     <code>
  72       new Dygraph(el,
  73                   "2009/07/12,100,200\n" +
  74                   "2009/07/19,150,201\n",
  75                   { labels: [ "Date", "Series1", "Series2" ] });
  76     </code>
  77
  78     <h4>x-values</h4>
  79     <p>Once the headers are parsed, dygraphs needs to determine the type of
  80     the x-values. They're either dates or numbers. To make this
  81     determination, it looks at the first column of the first row ("2009/07/12"
  82     in the example above). Here's the heuristic: if it contains a '-' or a '/',
  83     or otherwise doesn't parse as a float, then it's a date. Otherwise, it's a
  84     number.</p>
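<p>As a rough sketch, the heuristic reads like the function below.
<i>detectXType</i> is a hypothetical name, and the code follows the rule only
as it is worded here; the library's actual check may differ in detail. Read
literally, the rule also classifies a value such as "-5" as a date.</p>

```javascript
// Hypothetical helper: classify the first x-value as "date" or "number",
// following the heuristic as it is worded in the text.
function detectXType(firstValue) {
  const hasDateSeparator = firstValue.includes("-") || firstValue.includes("/");
  const notAFloat = Number.isNaN(parseFloat(firstValue));
  return hasDateSeparator || notAFloat ? "date" : "number";
}

const xTypes = ["2009/07/12", "2009-07-12", "12.5", "July 12"].map(detectXType);
```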
  85
  86     <p>Determining the type doesn't guarantee that every value will parse
  87     correctly. The general rule is:</p>
  88
  89     <ul>
  90       <li>For dates, your strings have to be parseable by <i>Date.parse</i>.
  91       <li>For numbers, your strings have to be parseable by <i>parseFloat</i>.
  92     </ul>
  93
  94     <p>You can manually verify this using a JavaScript console. If a value
  95     doesn't parse, dygraphs will put a warning about it on your console. But
  96     beware: different browsers support different date formats!</p>
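<p>One way to run that check by hand is a small helper like the one below,
pasted into a console. <i>findUnparseable</i> is a hypothetical name. Since
<i>Date.parse</i> results vary between browsers, run it in the browser you
care about.</p>

```javascript
// Hypothetical helper: return the values that fail to parse as the given type.
function findUnparseable(values, type) {
  const parse = type === "date" ? Date.parse : parseFloat;
  // Both Date.parse and parseFloat signal failure by returning NaN.
  return values.filter((value) => Number.isNaN(parse(value)));
}

const badNumbers = findUnparseable(["12", "12.", "1.24e+1", "twelve"], "number");
const badDates = findUnparseable(["2009/07/12", "garbage"], "date");
```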
  97
  98     <p>Here are some valid date formats:</p>
  99     <ul>
 100       <li>2009-07-12</li>
 101       <li>2009/07/12</li>
 102       <li>2009/07/12 12</li>
 103       <li>2009/07/12 12:34</li>
 104       <li>2009/07/12 12:34:56</li>
 105     </ul>
 106
 107     <p>If you specify the <i>xValueParser</i> option, then all this detection is
 108     bypassed and your function is called instead. Your parser function takes in
 109     a string and needs to return a number. For dates/times, you should return
 110     milliseconds since the epoch. You may also want to specify a few other options
 111     to make sure that everything gets displayed properly.</p>
 112
 113     <p>Here's code which parses a CSV file with unix timestamps in the first
 114     column:</p>
 115
 116     <code>
 117       new Dygraph(el,
 118                   "Date,Series1,Series2\n" +
 119                   "1247382000,100,200\n" +
 120                   "1247986800,150,201\n",
 121                   {
 122                     xValueFormatter: Dygraph.dateString_,
 123                     xValueParser: function(x) { return 1000*parseInt(x, 10); },
 124                     xTicker: Dygraph.dateTicker
 125                   });
 126     </code>
 127
 128     <h4>y-values</h4>
 129     <p>Dependent (y-axis) values are simpler than x-values because they're
 130     always numbers. The complexity here comes from the various ways that you can
 131     specify the uncertainty in your measurements.</p>
 132
 133     <p>If your y-values are just numbers, then they need to be parseable by
 134     JavaScript's parseFloat function. Acceptable formats include:</p>
 135
 136     <ul>
 137       <li>12
 138       <li>-12
 139       <li>12.
 140       <li>12.3
 141       <li>1.24e+1
 142       <li>-1.24e+1
 143     </ul>
 144
 145     <p>If you have missing data, just leave the column blank (your CSV file will
 146     probably contain a ",," in it).</p>
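<p>To make the ",," case concrete, here is a minimal sketch of splitting one
data line. <i>parseDataLine</i> is a hypothetical helper, and representing a
blank field as <i>null</i> is an assumption made for this illustration.</p>

```javascript
// Hypothetical helper: split one CSV data line into an x string and y numbers.
// Treating a blank field as null is an assumption made for this illustration.
function parseDataLine(line, delimiter) {
  const [x, ...fields] = line.split(delimiter);
  const ys = fields.map((field) => (field.trim() === "" ? null : parseFloat(field)));
  return { x, ys };
}

// The second series has no value on this date.
const parsedLine = parseDataLine("2009/07/12,100,,200", ",");
```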
 147
 148     <p>If your numbers have uncertainty associated with them, then there are
 149     three basic ways to express this: using fractions, standard deviations or
 150     explicit ranges.</p>
 151
 152     <h5>Fractions</h5>
 153     <p>If you specify the <i>fractions</i> option, then your data will all be
 154     interpreted as ratios between zero and one. This is often the case if you're
 155     plotting a percentage.</p>
 156
 157     <code>
 158       new Dygraph(el,
 159                   "X,Frac1,Frac2\n" +
 160                   "1,1/2,3/4\n"+
 161                   "2,1/3,2/3\n"+
 162                   "3,2/3,17/49\n"+
 163                   "4,25/30,100/200",
 164                   { fractions: true });
 165     </code>
 166
 167     <p>Why not just divide the fractions out yourself? There are two attractive
 168     reasons not to:</p>
 169
 170     <ul>
 171       <li>If you set both <i>fractions</i> and <i>errorBars</i>, then the
 172     denominator is interpreted as a sample size and dygraphs will plot <a
 173       href="http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval">Wilson
 174       binomial proportion confidence intervals</a> around each point.
 175
 176       <li>If you set <i>showRoller</i>, then dygraphs will combine the values as
 177       fractions. If two points are <i>a/b</i> and <i>c/d</i>, it will plot
 178       <i>(a+c) / (b+d)</i> rather than <i>(a/b + c/d) / 2</i>, which is what
 179       you'd get if you divided the fractions through. This will also shrink the
 180       confidence intervals.</li>
 181     </ul>
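<p>Both points can be checked with a little arithmetic. The sketch below pools
a window of fractions by summing numerators and denominators, and computes a
Wilson score interval from the standard textbook formula. The helper names are
hypothetical, and it is an assumption that dygraphs evaluates the interval in
exactly this form.</p>

```javascript
// Hypothetical helpers; fractions are represented as [numerator, denominator].

// Pool a window of fractions: sum numerators and denominators separately.
function poolFractions(fractions) {
  let num = 0;
  let den = 0;
  for (const [n, d] of fractions) {
    num += n;
    den += d;
  }
  return [num, den];
}

// Wilson score interval for the proportion num/den at z standard deviations.
function wilsonInterval(num, den, z) {
  const p = num / den;
  const z2 = z * z;
  const scale = 1 + z2 / den;
  const center = (p + z2 / (2 * den)) / scale;
  const halfWidth = (z * Math.sqrt((p * (1 - p)) / den + z2 / (4 * den * den))) / scale;
  return { low: center - halfWidth, high: center + halfWidth };
}

const windowFractions = [[1, 2], [3, 4]];
const [pooledNum, pooledDen] = poolFractions(windowFractions);
const pooledValue = pooledNum / pooledDen; // (1+3) / (2+4)
const meanOfRatios = (1 / 2 + 3 / 4) / 2;  // dividing first, then averaging
const singleBand = wilsonInterval(1, 2, 2);
const pooledBand = wilsonInterval(pooledNum, pooledDen, 2);
```

<p>The pooled sample size is larger than that of either point alone, which is
why <i>pooledBand</i> comes out narrower than <i>singleBand</i>.</p>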
 182
 183     <h5>Standard Deviations</h5>
 184     <p>Often you have a measurement and also a measure of its uncertainty: a
 185     standard deviation. If you specify the <i>errorBars</i> option, dygraphs
 186     will look for alternating value and standard deviation columns in your CSV
 187     data.  Here's what it should look like:</p>
 188
 189     <code>
 190       new Dygraph(el,
 191                   "X,Y1,Y2\n" +
 192                   "1,10,5,20,5\n" +
 193                   "2,12,5,22,5\n",
 194                   { errorBars: true });
 195     </code>
 196
 197     <p>The "5" values are standard deviations. When each point is plotted, a
 198     2-standard deviation region around it is shaded, resulting in a 95%
 199     confidence interval. If you want more or less confidence, you can set the
 200     <i>sigma</i> option to something other than 2.0.</p>
 201
 202     <p>When you roll data with standard deviations, dygraphs will plot the
 203     average of your values in each rolling period and the RMS value of your
 204     standard deviations: sqrt(std1<sup>2</sup> + std2<sup>2</sup> + std3<sup>2</sup> + ... + stdN<sup>2</sup>)/N.</p>
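<p>A minimal sketch of that rolling step, assuming the combined deviation is
the square root of the sum of the squared deviations, divided by N.
<i>rollWithStdDev</i> is a hypothetical helper, and the <i>sigma</i> value of
2.0 mirrors the default mentioned above.</p>

```javascript
// Hypothetical helper: roll a window of [value, stdDev] pairs into one pair.
function rollWithStdDev(points) {
  const n = points.length;
  const meanValue = points.reduce((sum, [value]) => sum + value, 0) / n;
  const sumOfSquares = points.reduce((sum, [, std]) => sum + std * std, 0);
  // Assumed combination rule: sqrt(std1^2 + ... + stdN^2) / N.
  return [meanValue, Math.sqrt(sumOfSquares) / n];
}

const [rolledValue, rolledStd] = rollWithStdDev([[10, 5], [12, 5]]);
const sigma = 2.0;
const shadedBand = [rolledValue - sigma * rolledStd, rolledValue + sigma * rolledStd];
```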
 205
 206     <h5>Custom error bars</h5>
 207     <p>Sometimes your data has asymmetric uncertainty or you want to specify
 208     something else with the error bars around a point. One example of this is
 209     the "temperatures" demo on the <a href="http://danvk.org/dygraphs">dygraphs
 210       home page</a>, where the point is the daily average and the bars denote
 211     the low and high temperatures for the day.</p>
 212
 213     <p>To specify this format, set the <i>customBars</i> option. Your CSV values
 214     should each be three numbers separated by semicolons ("low;mid;high").
 215     Here's an example:</p>
 216
 217     <code>
 218       new Dygraph(el,
 219                   "X,Y1,Y2\n" +
 220                   "1,10;20;30,20;5;25\n" +
 221                   "2,10;25;35,20;10;25\n",
 222                   { customBars: true });
 223     </code>
 224
 225     <p>The middle value need not lie between the low and high values. If you set
 226     a rolling period, the three values will all be averaged independently.</p>
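<p>Averaging the three values independently amounts to the following sketch.
<i>rollCustomBars</i> is a hypothetical helper, and each point is written as a
[low, mid, high] triple for the purpose of the illustration.</p>

```javascript
// Hypothetical helper: average low, mid and high independently over a window.
function rollCustomBars(points) {
  const n = points.length;
  const sums = [0, 0, 0];
  for (const [low, mid, high] of points) {
    sums[0] += low;
    sums[1] += mid;
    sums[2] += high;
  }
  return sums.map((sum) => sum / n);
}

// The Y1 column of the example above, rolled over both rows.
const rolledBars = rollCustomBars([[10, 20, 30], [10, 25, 35]]);
```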
 227
 228
 229     <h3>URL</h3>
 230     <p>If you pass in a URL, dygraphs will issue an XMLHttpRequest for it and
 231     attempt to parse the returned data as CSV.
 232     </p>
 233
 234     <p><i>Common problems</i>. Make sure the URL is accessible and returns data
 235     in text format (as opposed to a CSV file with an HTML header). You can see
 236     what the response looks like by checking your JS console or by requesting
 237     the URL yourself.</p>
 238
 239
 240     <h3>Array (native format)</h3>
 241     <p>If you'll be constructing your data set from a server-side program (or
 242     from JavaScript) then you're better off producing an array than CSV data.
 243     This saves the cost of parsing the CSV data and also avoids common parser
 244     errors.</p>
 245
 246     <p>The downside is that it's harder to look at your data (you'll need to use
 247     a JS debugger) and that the data format is a bit less clear for values with
 248     uncertainties.</p>
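<p>As an illustration of building an array up front, the sketch below converts
simple numeric CSV text into rows. It assumes the array format is one array per
row with the x-value first, which this page has not yet spelled out, and
<i>csvToRows</i> is a hypothetical helper.</p>

```javascript
// Hypothetical helper: convert simple numeric CSV text into an array of rows.
// Assumes a header line, a numeric x column and plain numeric y-values.
function csvToRows(csvText) {
  const lines = csvText.trim().split("\n");
  const labels = lines[0].split(",");
  const rows = lines.slice(1).map((line) =>
    line.split(",").map((field) => parseFloat(field))
  );
  return { labels, rows };
}

const converted = csvToRows("X,Y1,Y2\n1,10,20\n2,12,22\n");
```

<p>Under that assumption, <i>converted.rows</i> would be passed as the
<i>data</i> parameter and <i>converted.labels</i> as the <i>labels</i>
option.</p>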
 249
 250     <pre>
 251     Array
 252     - disclaimers
 253     - Dates on the x-axis
 254     - how to specify fractions
 255     - how to specify missing data
 256     - how to specify value + std. dev.
 257     - how to specify [low, middle, high]
 258
 259     Functions
 260     - make sure they work as expected:
 261         function() { return x; }
 262       is identical as a source to "x".
 263
 264     DataTable
 265     - Links to relevant gviz docs
 266     - When to use Dygraph.GvizWrapper
 267     - how to specify fractions
 268     - how to specify missing data
 269     - how to specify value + std. dev.
 270     - how to specify [low, middle, high]
 271     - walkthrough of embedding a gadget in google docs/on a web page
 272     - walkthrough of using std. dev. in a spreadsheet chart
 273     </pre>
 274   </body>
 275 </html>