<html>
<head>
<title>dygraphs input types</title>
<style type="text/css">
code { white-space: pre; border: 1px dashed black; display: block; }
pre { white-space: pre; border: 1px dashed black; }
body { max-width: 800px; }
</style>
</head>
<body>
<h2>dygraphs Data Format</h2>

<p>When you create a Dygraph object, your code looks something like
this:</p>

<code>
g = new Dygraph(document.getElementById("div"),
                <i>data</i>,
                { <i>options</i> });
</code>

<p>This document is about what you can put in the <i>data</i>
parameter.</p>

<p>There are five types of input that dygraphs will accept:</p>
<ol>
<li><a href="#csv">CSV data</a>
<li><a href="#url">URL</a>
<li><a href="#array">array (native format)</a>
<li><a href="#function">function</a>
<li><a href="#datatable">DataTable</a>
</ol>

<p>These are all discussed below. If you're trying to debug why your input
won't parse, <b>check the JS error console</b>. dygraphs tries to log
informative errors explaining what's wrong with your data, and these can
often point you in the right direction.</p>

<p>There are several options which affect how your input data is
interpreted. These are:</p>
<ul>
<li> <i>xValueParser</i> affects CSV only.
<li> <i>errorBars</i> affects all input types.
<li> <i>customBars</i> affects all input types.
<li> <i>fractions</i> affects all input types.
<li> <i>labels</i> affects all input types.
</ul>

<a name="csv"></a><h3>CSV</h3>
<p>Here's an example of what CSV data should look like:</p>
<pre>
Date,Series1,Series2
2009/07/12,100,200  # comments are OK on data lines
2009/07/19,150,201
</pre>

<p>"CSV" is actually a bit of a misnomer: the data can be tab-delimited,
too. The delimiter is set by the <i>delimiter</i> option, which defaults to
",". If no delimiter is found in the first row, dygraphs switches over to
tab.</p>
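<p>The fallback rule can be sketched in a few lines. This is only an
illustration of the behavior described above: <i>detectDelimiter</i> is a
hypothetical helper, not a dygraphs function, and the library's own logic may
differ in its details.</p>

```javascript
// Hypothetical helper, not a dygraphs function: pick the delimiter for a
// block of CSV text, falling back to tab when the first row lacks it.
function detectDelimiter(csvText, delimiter) {
  if (delimiter === undefined) delimiter = ",";
  var firstRow = csvText.split("\n")[0];
  return firstRow.indexOf(delimiter) === -1 ? "\t" : delimiter;
}

var commaData = "Date,Series1,Series2\n2009/07/12,100,200\n";
var tabData = "Date\tSeries1\tSeries2\n2009/07/12\t100\t200\n";
console.log(JSON.stringify(detectDelimiter(commaData))); // ","
console.log(JSON.stringify(detectDelimiter(tabData)));   // "\t"
```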

<p>CSV parsing can be split into three parts: headers, x-values and
y-values.</p>

<h4>Headers</h4>
<p>If you don't specify the <i>labels</i> option, dygraphs will look at the
first line of your CSV data to get the labels. If you see numbers for series
labels when you hover over the dygraph, it's likely because your first line
contains data but is being parsed as a label. The solution is to either add
a header line or specify the labels like this:</p>

<code>
new Dygraph(el,
            "2009/07/12,100,200\n" +
            "2009/07/19,150,201\n",
            { labels: [ "Date", "Series1", "Series2" ] });
</code>

<h4>x-values</h4>
<p>Once the headers are parsed, dygraphs needs to determine the type of
the x-values. They're either dates or numbers. To make this
determination, it looks at the first column of the first data row
("2009/07/12" in the example above). Here's the heuristic: if it contains a
'-' or a '/', or otherwise doesn't parse as a float, then it's a date.
Otherwise, it's a number.</p>
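<p>Written out as code, the heuristic might look like the sketch below.
<i>looksLikeDate</i> is a hypothetical name used only for illustration; it
follows the wording above and is not copied from the dygraphs source.</p>

```javascript
// Hypothetical helper that mirrors the heuristic described above; the
// actual dygraphs implementation may differ in its details.
function looksLikeDate(str) {
  if (str.indexOf("-") !== -1 || str.indexOf("/") !== -1) return true;
  return isNaN(parseFloat(str));
}

console.log(looksLikeDate("2009/07/12")); // true
console.log(looksLikeDate("1247382000")); // false
console.log(looksLikeDate("12.5"));       // false
```

<p>Taken literally, this rule also treats a negative number such as "-5" as
a date, because it contains a '-'. If your x-values can be negative, it is
worth checking how your first data row is classified.</p>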

<p>Once the type is determined, that doesn't mean all the values will parse
correctly. The general rule is:</p>

<ul>
<li>For dates, your strings have to be parseable by <i>Date.parse</i>.
<li>For numbers, your strings have to be parseable by <i>parseFloat</i>.
</ul>

<p>You can manually verify this using a JavaScript console. If a value
doesn't parse, dygraphs will put a warning about it on your console. But
beware: different browsers support different date formats!</p>

<p>Here are some valid date formats:</p>
<ul>
<li>2009-07-12</li>
<li>2009/07/12</li>
<li>2009/07/12 12</li>
<li>2009/07/12 12:34</li>
<li>2009/07/12 12:34:56</li>
</ul>
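<p>One way to check these formats in a given browser is to run each string
through <i>Date.parse</i> in the console. The snippet below is a small
sketch of that check; <i>parsesAsDate</i> is a name made up for this
example, and the results may vary from one JavaScript engine to another.</p>

```javascript
// Try each candidate string with Date.parse and report whether this
// JavaScript engine accepts it. Results can differ between engines.
var candidates = [
  "2009-07-12",
  "2009/07/12",
  "2009/07/12 12",
  "2009/07/12 12:34",
  "2009/07/12 12:34:56"
];

function parsesAsDate(str) {
  return !isNaN(Date.parse(str));
}

for (var i = 0; i < candidates.length; i++) {
  console.log(candidates[i] + " -> " + parsesAsDate(candidates[i]));
}
```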

<p>If you specify the <i>xValueParser</i> option, then all this detection is
bypassed and your function is called instead. Your parser function takes in
a string and needs to return a number. For dates/times, you should return
milliseconds since epoch. You may also want to specify a few other options
to make sure that everything gets displayed properly.</p>

<p>Here's code which parses a CSV file with unix timestamps in the first
column:</p>

<code>
new Dygraph(el,
            "Date,Series1,Series2\n" +
            "1247382000,100,200\n" +
            "1247986800,150,201\n",
            {
              xValueFormatter: Dygraph.dateString_,
              // unix timestamps are in seconds; convert to milliseconds
              xValueParser: function(x) { return 1000 * parseInt(x, 10); },
              xTicker: Dygraph.dateTicker
            });
</code>
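<p>To see what such a parser returns, it can help to pull it out into a
standalone function and try it in a console. The name
<i>parseUnixSeconds</i> below is only for illustration.</p>

```javascript
// The same kind of parser as a standalone function, so it can be checked
// in a console. The name parseUnixSeconds is only for illustration.
function parseUnixSeconds(x) {
  return 1000 * parseInt(x, 10);
}

var ms = parseUnixSeconds("1247382000");
console.log(ms);                         // 1247382000000
console.log(new Date(ms).toISOString()); // 2009-07-12T07:00:00.000Z
```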

<h4>y-values</h4>
<p>Dependent (y-axis) values are simpler than x-values because they're
always numbers. The complexity here comes from the various ways that you can
specify the uncertainty in your measurements.</p>

<p>If your y-values are just numbers, then they need to be parseable by
JavaScript's parseFloat function. Acceptable formats include:</p>

<ul>
<li>12
<li>-12
<li>12.
<li>12.3
<li>1.24e+1
<li>-1.24e+1
</ul>

<p>If you have missing data, just leave the column blank (your CSV file will
probably contain a ",," in it).</p>
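<p>As a rough sketch of how a row with a blank column can be read, the
snippet below maps each cell to a number and turns a blank cell into null.
<i>parseYValue</i> is a hypothetical helper, and representing the gap as
null is an assumption borrowed from the native array format described later
in this document.</p>

```javascript
// Hypothetical helper: convert one CSV cell into a y-value, treating a
// blank cell as missing data (null). Not a dygraphs function.
function parseYValue(cell) {
  if (cell.trim() === "") return null;
  return parseFloat(cell);
}

var row = "2009/07/12,100,,1.24e+1".split(",");
var yValues = row.slice(1).map(parseYValue);
console.log(yValues); // [ 100, null, 12.4 ]
```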

<p>If your numbers have uncertainty associated with them, then there are
three basic ways to express this: using fractions, standard deviations or
explicit ranges.</p>

<h5>Fractions</h5>
<p>If you specify the <i>fractions</i> option, then your data will all be
interpreted as ratios between zero and one. This is often the case if you're
plotting a percentage.</p>

<code>
new Dygraph(el,
            "X,Frac1,Frac2\n" +
            "1,1/2,3/4\n" +
            "2,1/3,2/3\n" +
            "3,2/3,17/49\n" +
            "4,25/30,100/200",
            { fractions: true });
</code>

<p>Why not just divide the fractions out yourself? There are two attractive
reasons not to:</p>

<ul>
<li>If you set both <i>fractions</i> and <i>errorBars</i>, then the
denominator is interpreted as a sample size and dygraphs will plot <a
href="http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval">Wilson
binomial proportion confidence intervals</a> around each point.</li>

<li>If you set <i>showRoller</i>, then dygraphs will combine the values as
fractions. If two points are <i>a/b</i> and <i>c/d</i>, it will plot
<i>(a+c) / (b+d)</i> rather than <i>(a/b + c/d) / 2</i>, which is what
you'd get if you divided the fractions through. This will also shrink the
confidence intervals.</li>
</ul>
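<p>Both points can be illustrated with a short sketch. The helpers below are
hypothetical and are not taken from the dygraphs source:
<i>combineFractions</i> sums numerators and denominators, and
<i>wilsonInterval</i> uses the textbook Wilson score formula with
<i>z</i> = 2 chosen as an assumption for this example. The exact
computation inside dygraphs may differ.</p>

```javascript
// Hypothetical helpers for illustration; dygraphs' own code may differ.

// Combine [numerator, denominator] pairs by summing the numerators and
// summing the denominators, instead of averaging the ratios.
function combineFractions(pairs) {
  var num = 0, den = 0;
  for (var i = 0; i < pairs.length; i++) {
    num += pairs[i][0];
    den += pairs[i][1];
  }
  return [num, den];
}

// Wilson score interval for num successes out of den trials, using z
// standard deviations (z = 2 is an assumption made for this sketch).
function wilsonInterval(num, den, z) {
  var p = num / den;
  var z2 = z * z;
  var denom = 1 + z2 / den;
  var center = (p + z2 / (2 * den)) / denom;
  var half = z * Math.sqrt(p * (1 - p) / den + z2 / (4 * den * den)) / denom;
  return [center - half, center + half];
}

var combined = combineFractions([[1, 2], [3, 4]]);
console.log(combined);                  // [ 4, 6 ]
console.log(combined[0] / combined[1]); // about 0.667, not (1/2 + 3/4) / 2 = 0.625
console.log(wilsonInterval(1, 2, 2));
console.log(wilsonInterval(combined[0], combined[1], 2));
```

<p>The interval for the combined point is narrower than the interval for 1/2
alone, because the combined sample size is larger.</p>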

<h5>Standard Deviations</h5>
<p>Often you have a measurement and also a measure of its uncertainty: a
standard deviation. If you specify the <i>errorBars</i> option, dygraphs
will look for alternating value and standard deviation columns in your CSV
data. Here's what it should look like:</p>

<code>
new Dygraph(el,
            "X,Y1,Y2\n" +
            "1,10,5,20,5\n" +
            "2,12,5,22,5\n",
            { errorBars: true });
</code>

<p>The "5" values are standard deviations. When each point is plotted, a
2-standard deviation region around it is shaded, resulting in roughly a 95%
confidence interval. If you want more or less confidence, you can set the
<i>sigma</i> option to something other than 2.0.</p>

<p>When you roll data with standard deviations, dygraphs will plot the
average of your values in each rolling period and combine your standard
deviations as the square root of the sum of their squares, divided by N:
sqrt(std1^2 + std2^2 + std3^2 + ... + stdN^2)/N.</p>
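<p>Here is a minimal sketch of rolling a window of points that carry
standard deviations. It assumes the deviations are combined in quadrature
(square root of the sum of squares, divided by N), which is the usual rule
for the average of independent measurements. <i>rollWithStdDev</i> is a
hypothetical helper, not a dygraphs function.</p>

```javascript
// Hypothetical helper: roll N points of [value, stdDev] into one point.
// The value is the plain average; the standard deviations are combined as
// sqrt(std1^2 + ... + stdN^2) / N, which assumes independent measurements.
function rollWithStdDev(points) {
  var sum = 0, variance = 0;
  for (var i = 0; i < points.length; i++) {
    sum += points[i][0];
    variance += points[i][1] * points[i][1];
  }
  var n = points.length;
  return [sum / n, Math.sqrt(variance) / n];
}

var rolled = rollWithStdDev([[10, 5], [12, 5]]);
console.log(rolled[0]);            // 11
console.log(rolled[1].toFixed(3)); // 3.536
```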

<h5>Custom error bars</h5>
<p>Sometimes your data has asymmetric uncertainty or you want to specify
something else with the error bars around a point. One example of this is
the "temperatures" demo on the <a href="http://danvk.org/dygraphs">dygraphs
home page</a>, where the point is the daily average and the bars denote
the low and high temperatures for the day.</p>

<p>To specify this format, set the <i>customBars</i> option. Your CSV values
should each be three numbers separated by semicolons ("low;mid;high").
Here's an example:</p>

<code>
new Dygraph(el,
            "X,Y1,Y2\n" +
            "1,10;20;30,20;5;25\n" +
            "2,10;25;35,20;10;25\n",
            { customBars: true });
</code>

<p>The middle value need not lie between the low and high values. If you set
a rolling period, the three values will all be averaged independently.</p>
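<p>The sketch below shows one way to read a "low;mid;high" cell and to
average several such triples component by component. Both helpers are
hypothetical and only illustrate the format; they are not part of the
dygraphs API.</p>

```javascript
// Hypothetical helpers, not part of the dygraphs API.

// Split one "low;mid;high" CSV cell into three numbers.
function parseCustomBar(cell) {
  var parts = cell.split(";");
  return [parseFloat(parts[0]), parseFloat(parts[1]), parseFloat(parts[2])];
}

// Average several [low, mid, high] triples component by component.
function averageBars(bars) {
  var out = [0, 0, 0];
  for (var i = 0; i < bars.length; i++) {
    for (var k = 0; k < 3; k++) out[k] += bars[i][k] / bars.length;
  }
  return out;
}

var first = parseCustomBar("10;20;30");
var second = parseCustomBar("10;25;35");
console.log(first);                        // [ 10, 20, 30 ]
console.log(averageBars([first, second])); // [ 10, 22.5, 32.5 ]
```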


<a name="url"></a><h3>URL</h3>
<p>If you pass in a URL, dygraphs will issue an XMLHttpRequest for it and
attempt to parse the returned data as CSV.
</p>

<p><i>Common problems</i>. Make sure the URL is accessible and returns data
in text format (as opposed to a CSV file with an HTML header). You can see
what the response looks like by checking your JS console or by requesting
the URL yourself.</p>


<a name="array"></a><h3>Array (native format)</h3>
<p>If you'll be constructing your data set from a server-side program (or
from JavaScript) then you're better off producing an array than CSV data.
This saves the cost of parsing the CSV data and also avoids common parser
errors.</p>

<p>The downside is that it's harder to look at your data (you'll need to use
a JS debugger) and that the data format is a bit less clear for values with
uncertainties.</p>

<p>Here's an example of "native format":</p>

<code>
new Dygraph(document.getElementById("graphdiv2"),
            [
              [1,10,100],
              [2,20,80],
              [3,50,60],
              [4,70,80]
            ],
            {
              labels: [ "x", "A", "B" ]
            });
</code>

<h4>Headers</h4>
<p>Headers for native format must be specified via the <i>labels</i>
option. There's no other way to set them.</p>

<h4>x-values</h4>
<p>If you want your x-values to be dates, you'll need to specify a Date
object in the first column. Otherwise, specify a number. Here's a sample
array with dates on the x-axis:</p>

<code>
[
  [ new Date("2009/07/12"), 100, 200 ],
  [ new Date("2009/07/19"), 150, 220 ]
]
</code>
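<p>If your rows start out with date strings, one possible way to get them
into this shape is to map over them and wrap the first column in a Date.
The variable names and sample rows below are made up for this sketch.</p>

```javascript
// Hypothetical input: rows whose first column is a date string.
var rawRows = [
  ["2009/07/12", 100, 200],
  ["2009/07/19", 150, 220]
];

// Replace the leading string with a Date object; copy the y-values as-is.
var dateRows = rawRows.map(function(row) {
  return [new Date(row[0])].concat(row.slice(1));
});

console.log(dateRows[0][0] instanceof Date); // true
console.log(dateRows[1].slice(1));           // [ 150, 220 ]
```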

<h4>y-values</h4>
<p>You can specify <i>errorBars</i>, <i>fractions</i> or <i>customBars</i>
with the array format. If you specify any of these, the values become arrays
(rather than numbers). Here's what the format looks like for each one:</p>

<code>
<i>errorBars</i>: [x, [value1, std1], [value2, std2], ...]
<i>fractions</i>: [x, [num1, den1], [num2, den2], ...]
<i>customBars</i>: [x, [low1, val1, high1], [low2, val2, high2], ...]
</code>

<p>To specify missing data, set the value to null. You may not set a value
inside an array to null; use null in place of the entire array.</p>
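<p>As an illustration, the sketch below builds rows in the <i>errorBars</i>
shape from a list of readings, using null for a missing reading. The
<i>readings</i> data and the <i>toErrorBarRow</i> helper are hypothetical
and exist only for this example.</p>

```javascript
// Hypothetical sample data: each reading has a value and a standard
// deviation, and a missing reading is represented by null.
var readings = [
  { x: 1, a: { value: 10, std: 5 }, b: { value: 20, std: 5 } },
  { x: 2, a: null,                  b: { value: 22, std: 5 } }
];

// Build rows in the errorBars shape: [x, [value1, std1], [value2, std2]].
// A missing reading becomes null in place of the whole [value, std] pair.
function toErrorBarRow(r) {
  function cell(m) { return m === null ? null : [m.value, m.std]; }
  return [r.x, cell(r.a), cell(r.b)];
}

var rows = readings.map(toErrorBarRow);
console.log(JSON.stringify(rows));
// [[1,[10,5],[20,5]],[2,null,[22,5]]]
```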

<a name="function"></a><h3>Functions</h3>

<p>You can specify a function that returns any of the other types. If
<i>x</i> is a valid piece of dygraphs input, then so is</p>

<code>
function() { return x; }
</code>
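<p>For instance, a function that builds CSV text on demand would qualify.
The function below is a made-up example; <i>squaresCsv</i> and its data are
not from the dygraphs documentation.</p>

```javascript
// A function that builds CSV text on demand. Since a CSV string is valid
// dygraphs input, this function is valid input too. The name squaresCsv and
// its contents are made up for this example.
function squaresCsv() {
  var lines = ["X,Square"];
  for (var x = 1; x <= 3; x++) {
    lines.push(x + "," + (x * x));
  }
  return lines.join("\n") + "\n";
}

console.log(squaresCsv());
// In a page with dygraphs loaded, it could be passed as the data argument:
//   new Dygraph(el, squaresCsv, {});
```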

<a name="datatable"></a><h3>DataTable</h3>
<p>You can also specify a Google Visualization Library <a
href="http://code.google.com/apis/visualization/documentation/reference.html#DataTable">DataTable</a>
object as your input data. This lets you easily switch between dygraphs and
other gviz visualizations such as the Annotated Timeline. It also lets you
embed a Dygraph in a Google Spreadsheet.</p>

<p>You'll need to set your first column's type to one of "number", "date"
or "datetime".</p>

<pre>
DataTable TODO:
- When to use Dygraph.GvizWrapper
- how to specify fractions
- how to specify missing data
- how to specify value + std. dev.
- how to specify [low, middle, high]
- walkthrough of embedding a gadget in google docs/on a web page
- walkthrough of using std. dev. in a spreadsheet chart
</pre>

</body>
</html>