<!DOCTYPE html>
<html>
<head>
<title>dygraphs input types</title>
<style type="text/css">
code { white-space: pre; border: 1px dashed black; display: block; }
pre { white-space: pre; border: 1px dashed black; }
body { max-width: 800px; }
</style>
</head>
<body>
<h2>dygraphs Data Format</h2>

<p>When you create a Dygraph object, your code looks something like
this:</p>

<code>
  g = new Dygraph(document.getElementById("div"),
                  <i>data</i>,
                  { <i>options</i> });
</code>

<p>This document is about what you can put in the <i>data</i>
parameter.</p>

<p>There are five types of input that dygraphs will accept:</p>
<ol>
<li><a href="#csv">CSV data</a>
<li><a href="#url">URL</a>
<li><a href="#array">array (native format)</a>
<li><a href="#function">function</a>
<li><a href="#datatable">DataTable</a>
</ol>

<p>These are all discussed below. If you're trying to debug why your input
won't parse, <b>check the JS error console</b>. dygraphs tries to log
informative errors explaining what's wrong with your data, and these can
often point you in the right direction.</p>

<p>There are several options which affect how your input data is
interpreted. These are:</p>
<ul>
<li> <i>xValueParser</i> affects CSV only.
<li> <i>errorBars</i> affects all input types.
<li> <i>customBars</i> affects all input types.
<li> <i>fractions</i> affects all input types.
<li> <i>labels</i> affects all input types.
</ul>

<a name="csv"></a><h3>CSV</h3>
<p>Here's an example of what CSV data should look like:</p>
<pre>
Date,Series1,Series2
2009/07/12,100,200  # comments are OK on data lines
2009/07/19,150,201
</pre>

<p>"CSV" is actually a bit of a misnomer: the data can be tab-delimited,
too. The delimiter is set by the <i>delimiter</i> option. It defaults to ",".
If no delimiter is found in the first row, it switches over to tab.</p>
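
<p>The fallback described above can be sketched as follows. This is an
illustration only, not dygraphs' own code: <i>pickDelimiter</i> is a
hypothetical helper name, and the sketch assumes the check looks only at the
first line of the text.</p>

```javascript
// Hypothetical helper (not part of dygraphs): choose the field delimiter
// for a block of CSV text, falling back to tab when the configured
// delimiter does not appear in the first row.
function pickDelimiter(csvText, delimiter = ",") {
  const firstRow = csvText.split("\n")[0];
  return firstRow.includes(delimiter) ? delimiter : "\t";
}

const commaData = "Date,Series1,Series2\n2009/07/12,100,200\n";
const tabData = "Date\tSeries1\tSeries2\n2009/07/12\t100\t200\n";

console.log(JSON.stringify(pickDelimiter(commaData)));
console.log(JSON.stringify(pickDelimiter(tabData)));
```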

<p>CSV parsing can be split into three parts: headers, x-value and
y-values.</p>

<h4>Headers</h4>
<p>If you don't specify the <i>labels</i> option, dygraphs will look at the
first line of your CSV data to get the labels. If you see numbers for series
labels when you hover over the dygraph, it's likely because your first line
contains data but is being parsed as a label. The solution is to either add
a header line or specify the labels like this:</p>

<code>
  new Dygraph(el,
              "2009/07/12,100,200\n" +
              "2009/07/19,150,201\n",
              { labels: [ "Date", "Series1", "Series2" ] });
</code>

<h4>x-values</h4>
<p>Once the headers are parsed, dygraphs needs to determine the type of
the x-values. They're either dates or numbers. To make this
determination, it looks at the first column of the first row ("2009/07/12"
in the example above). Here's the heuristic: if it contains a '-' or a '/',
or otherwise doesn't parse as a float, then it's a date. Otherwise, it's a
number.</p>
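
<p>As a rough sketch, the heuristic reads like the function below.
<i>xValueType</i> is a hypothetical name, and the code follows the wording of
this page literally; the library's real check may differ in its details.</p>

```javascript
// Hypothetical sketch of the x-value type heuristic as worded above.
// Note: read literally, a value with a leading minus sign (e.g. "-12")
// also contains a '-' and is classified as a date here.
function xValueType(firstValue) {
  const hasDateSeparator =
    firstValue.includes("-") || firstValue.includes("/");
  const parsesAsFloat = !Number.isNaN(parseFloat(firstValue));
  return hasDateSeparator || !parsesAsFloat ? "date" : "number";
}

console.log(xValueType("2009/07/12"));
console.log(xValueType("12.5"));
```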

<p>Once the type is determined, that doesn't mean all the values will parse
correctly. The general rule is:</p>

<ul>
<li>For dates, your strings have to be parseable by <i>Date.parse</i>.
<li>For numbers, your strings have to be parseable by <i>parseFloat</i>.
</ul>

<p>You can manually verify this using a JavaScript console. If a value
doesn't parse, dygraphs will put a warning about it on your console. But
beware: different browsers support different date formats!</p>
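
<p>One way to run that check by hand is a small helper like the one below,
pasted into the console. <i>findUnparseable</i> is a hypothetical name; it
only reports which strings come back as NaN from <i>Date.parse</i> or
<i>parseFloat</i> in the JavaScript engine you run it in.</p>

```javascript
// Hypothetical console helper: list the values that fail to parse.
// kind is "date" (uses Date.parse) or "number" (uses parseFloat).
function findUnparseable(values, kind) {
  const parse = kind === "date" ? Date.parse : parseFloat;
  return values.filter((value) => Number.isNaN(parse(value)));
}

console.log(findUnparseable(["2009/07/12", "not a date"], "date"));
console.log(findUnparseable(["12", "abc", "1.24e+1"], "number"));
```

<p>Because date support varies, run the date check in each browser you care
about rather than trusting a single result.</p>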

<p>Here are some valid date formats:</p>
<ul>
<li>2009-07-12</li>
<li>2009/07/12</li>
<li>2009/07/12 12</li>
<li>2009/07/12 12:34</li>
<li>2009/07/12 12:34:56</li>
</ul>

<p>If you specify the <i>xValueParser</i> option, then all this detection is
bypassed and your function is called instead. Your parser function takes in
a string and needs to return a number. For dates/times, you should return
milliseconds since epoch. You may also want to specify a few other options
to make sure that everything gets displayed properly.</p>

<p>Here's code which parses a CSV file with unix timestamps in the first
column:</p>

<code>
  new Dygraph(el,
              "Date,Series1,Series2\n" +
              "1247382000,100,200\n" +
              "1247986800,150,201\n",
              {
                xValueFormatter: Dygraph.dateString_,
                // unix timestamps are in seconds; dygraphs wants milliseconds
                xValueParser: function(x) { return 1000 * parseInt(x, 10); },
                xTicker: Dygraph.dateTicker
              });
</code>

<h4>y-values</h4>
<p>Dependent (y-axis) values are simpler than x-values because they're
always numbers. The complexity here comes from the various ways that you can
specify the uncertainty in your measurements.</p>

<p>If your y-values are just numbers, then they need to be parseable by
JavaScript's parseFloat function. Acceptable formats include:</p>

<ul>
<li>12
<li>-12
<li>12.
<li>12.3
<li>1.24e+1
<li>-1.24e+1
</ul>

<p>If you have missing data, just leave the column blank (your CSV file will
probably contain a ",," in it).</p>
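
<p>To see how a blank column reads, here is a minimal sketch that parses the
y-values of one CSV line and maps empty fields to null. <i>parseYValues</i>
is a hypothetical helper and the use of null for a gap is an assumption of
this sketch, not a description of the library's parser.</p>

```javascript
// Hypothetical sketch: parse the y-values of one CSV data line.
// The first field (the x-value) is skipped; blank fields become null.
function parseYValues(line, delimiter = ",") {
  const fields = line.split(delimiter);
  return fields
    .slice(1)
    .map((field) => (field.trim() === "" ? null : parseFloat(field)));
}

console.log(parseYValues("2009/07/12,100,,200"));
```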

<p>If your numbers have uncertainty associated with them, then there are
three basic ways to express this: using fractions, standard deviations or
explicit ranges.</p>

<h5>Fractions</h5>
<p>If you specify the <i>fractions</i> option, then your data will all be
interpreted as ratios between zero and one. This is often the case if you're
plotting a percentage.</p>

<code>
  new Dygraph(el,
              "X,Frac1,Frac2\n" +
              "1,1/2,3/4\n" +
              "2,1/3,2/3\n" +
              "3,2/3,17/49\n" +
              "4,25/30,100/200",
              { fractions: true });
</code>

<p>Why not just divide the fractions out yourself? There are two attractive
reasons not to:</p>

<ul>
<li>If you set both <i>fractions</i> and <i>errorBars</i>, then the
denominator is interpreted as a sample size and dygraphs will plot <a
href="http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval">Wilson
binomial proportion confidence intervals</a> around each point.</li>

<li>If you set <i>showRoller</i>, then dygraphs will combine the values as
fractions. If two points are <i>a/b</i> and <i>c/d</i>, it will plot
<i>(a+c) / (b+d)</i> rather than <i>(a/b + c/d) / 2</i>, which is what
you'd get if you divided the fractions through. This will also shrink the
confidence intervals.</li>
</ul>
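
<p>Both points can be illustrated with a short sketch. The helper names
(<i>parseFraction</i>, <i>rollFractions</i>, <i>wilsonInterval</i>) are
hypothetical, and the interval uses the textbook Wilson score formula with
z = 2 as an assumption; dygraphs' own computation may differ in detail.</p>

```javascript
// Hypothetical helpers, not dygraphs' implementation.

// "3/4" -> { num: 3, den: 4 }
function parseFraction(text) {
  const [num, den] = text.split("/").map(Number);
  return { num, den };
}

// Pool fractions by summing numerators and denominators separately,
// so each point is weighted by its sample size.
function rollFractions(fractions) {
  const num = fractions.reduce((sum, f) => sum + f.num, 0);
  const den = fractions.reduce((sum, f) => sum + f.den, 0);
  return { num, den, value: num / den };
}

// Textbook Wilson score interval for num successes out of den trials.
function wilsonInterval(num, den, z = 2) {
  const p = num / den;
  const z2 = z * z;
  const scale = 1 + z2 / den;
  const center = (p + z2 / (2 * den)) / scale;
  const half =
    (z * Math.sqrt((p * (1 - p)) / den + z2 / (4 * den * den))) / scale;
  return [center - half, center + half];
}

const pooled = rollFractions([parseFraction("1/2"), parseFraction("3/4")]);
console.log(pooled.value);             // pooled ratio, 4/6
console.log((1 / 2 + 3 / 4) / 2);      // naive average of the two ratios
console.log(wilsonInterval(1, 2));     // interval for a single point
console.log(wilsonInterval(pooled.num, pooled.den)); // narrower when pooled
```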

<h5>Standard Deviations</h5>
<p>Often you have a measurement and also a measure of its uncertainty: a
standard deviation. If you specify the <i>errorBars</i> option, dygraphs
will look for alternating value and standard deviation columns in your CSV
data. Here's what it should look like:</p>

<code>
  new Dygraph(el,
              "X,Y1,Y2\n" +
              "1,10,5,20,5\n" +
              "2,12,5,22,5\n",
              { errorBars: true });
</code>

<p>The "5" values are standard deviations. When each point is plotted, a
2-standard deviation region around it is shaded, resulting in a 95%
confidence interval. If you want more or less confidence, you can set the
<i>sigma</i> option to something other than 2.0.</p>

<p>When you roll data with standard deviations, dygraphs will plot the
average of your values in each rolling period and a combined standard
deviation, the square root of the summed squares divided by the count:
sqrt(std1<sup>2</sup> + std2<sup>2</sup> + std3<sup>2</sup> + ... + stdN<sup>2</sup>)/N.</p>
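
<p>A minimal sketch of the shaded band and of rolling, assuming the standard
deviations in a rolling period are combined as the square root of the sum of
their squares, divided by the number of points. <i>errorBand</i> and
<i>rollWithStd</i> are hypothetical names used only for this illustration.</p>

```javascript
// Hypothetical helpers, not dygraphs' implementation.

// The shaded region around one point: value +/- sigma standard deviations.
function errorBand(value, std, sigma = 2) {
  return [value - sigma * std, value + sigma * std];
}

// points is an array of [value, std] pairs from one rolling period.
// Returns [mean of values, sqrt(sum of std^2) / N].
function rollWithStd(points) {
  const n = points.length;
  const mean = points.reduce((sum, [value]) => sum + value, 0) / n;
  const sumOfSquares = points.reduce((sum, [, std]) => sum + std * std, 0);
  return [mean, Math.sqrt(sumOfSquares) / n];
}

console.log(errorBand(10, 5));
console.log(rollWithStd([[10, 5], [12, 5]]));
```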

<h5>Custom error bars</h5>
<p>Sometimes your data has asymmetric uncertainty or you want to specify
something else with the error bars around a point. One example of this is
the "temperatures" demo on the <a href="http://danvk.org/dygraphs">dygraphs
home page</a>, where the point is the daily average and the bars denote
the low and high temperatures for the day.</p>

<p>To specify this format, set the <i>customBars</i> option. Your CSV values
should each be three numbers separated by semicolons ("low;mid;high").
Here's an example:</p>

<code>
  new Dygraph(el,
              "X,Y1,Y2\n" +
              "1,10;20;30,20;5;25\n" +
              "2,10;25;35,20;10;25\n",
              { customBars: true });
</code>

<p>The middle value need not lie between the low and high values. If you set
a rolling period, the three values will all be averaged independently.</p>
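
<p>Averaging the three values independently can be sketched like this.
<i>parseCustomBar</i> and <i>rollCustomBars</i> are hypothetical helpers
written for this page, not library functions.</p>

```javascript
// Hypothetical helpers, not dygraphs' implementation.

// "10;20;30" -> [10, 20, 30]  (low, mid, high)
function parseCustomBar(text) {
  const [low, mid, high] = text.split(";").map(Number);
  return [low, mid, high];
}

// Average low, mid and high separately across one rolling period.
function rollCustomBars(bars) {
  const n = bars.length;
  return [0, 1, 2].map(
    (i) => bars.reduce((sum, bar) => sum + bar[i], 0) / n
  );
}

const y1 = ["10;20;30", "10;25;35"].map(parseCustomBar);
console.log(rollCustomBars(y1));
```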


<a name="url"></a><h3>URL</h3>
<p>If you pass in a URL, dygraphs will issue an XMLHttpRequest for it and
attempt to parse the returned data as CSV.
</p>

<p><i>Common problems</i>. Make sure the URL is accessible and returns data
in text format (as opposed to a CSV file with an HTML header). You can see
what the response looks like by checking your JS console or by requesting
the URL yourself.</p>


<a name="array"></a><h3>Array (native format)</h3>
<p>If you'll be constructing your data set from a server-side program (or
from JavaScript) then you're better off producing an array than CSV data.
This saves the cost of parsing the CSV data and also avoids common parser
errors.</p>

<p>The downside is that it's harder to look at your data (you'll need to use
a JS debugger) and that the data format is a bit less clear for values with
uncertainties.</p>

<p>Here's an example of "native format":</p>

<code>
  new Dygraph(document.getElementById("graphdiv2"),
              [
                [1,10,100],
                [2,20,80],
                [3,50,60],
                [4,70,80]
              ],
              {
                labels: [ "x", "A", "B" ]
              });
</code>

<h4>Headers</h4>
<p>Headers for native format must be specified via the <i>labels</i>
option. There's no other way to set them.</p>
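
<p>If you already have CSV text with numeric x-values, a conversion to native
format might look like the sketch below: the header line becomes the
<i>labels</i> array and each data line becomes a row of numbers.
<i>csvToNative</i> is a hypothetical helper that handles only plain numeric
values (no dates, fractions or error bars).</p>

```javascript
// Hypothetical helper: turn simple numeric CSV text into a labels array
// plus rows in native format. Blank fields become null.
function csvToNative(csvText, delimiter = ",") {
  const lines = csvText.split("\n").filter((line) => line.trim() !== "");
  const labels = lines[0].split(delimiter);
  const rows = lines.slice(1).map((line) =>
    line
      .split(delimiter)
      .map((field) => (field.trim() === "" ? null : parseFloat(field)))
  );
  return { labels, rows };
}

const native = csvToNative("x,A,B\n1,10,100\n2,20,80\n");
console.log(native.labels);
console.log(native.rows);
```

<p>The result would then be passed as <i>native.rows</i> for the data and
<i>native.labels</i> for the <i>labels</i> option.</p>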

<h4>x-values</h4>
<p>If you want your x-values to be dates, you'll need to specify a Date
object in the first column. Otherwise, specify a number. Here's a sample
array with dates on the x-axis:</p>

<code>
  [
    [ new Date("2009/07/12"), 100, 200 ],
    [ new Date("2009/07/19"), 150, 220 ]
  ]
</code>
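
<p>If your rows start with unix timestamps in seconds, as in the CSV example
earlier on this page, one way to get Date objects into the first column is a
small mapping step. <i>withDateX</i> is a hypothetical helper name.</p>

```javascript
// Hypothetical helper: replace a leading unix timestamp (seconds) in each
// row with a Date object, leaving the y-values untouched.
function withDateX(rows) {
  return rows.map(([seconds, ...ys]) => [new Date(seconds * 1000), ...ys]);
}

const dated = withDateX([
  [1247382000, 100, 200],
  [1247986800, 150, 201],
]);
console.log(dated[0][0] instanceof Date);
console.log(dated[0][0].getTime());
```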

<h4>y-values</h4>
<p>You can specify <i>errorBars</i>, <i>fractions</i> or <i>customBars</i>
with the array format. If you specify any of these, the values become arrays
(rather than numbers). Here's what the format looks like for each one:</p>

<code>
  <i>errorBars</i>: [x, [value1, std1], [value2, std2], ...]
  <i>fractions</i>: [x, [num1, den1], [num2, den2], ...]
  <i>customBars</i>: [x, [low1, val1, high1], [low2, val2, high2], ...]
</code>

<p>To specify missing data, set the value to null. You may not set a value
inside an array to null; use null in place of the entire array instead.</p>
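
<p>As an illustration of the <i>errorBars</i> layout with a gap, the sketch
below builds rows from separate value and standard deviation lists and puts
null in place of the whole pair when either part is missing.
<i>toErrorBarRows</i> and its input shape are assumptions made for this
example.</p>

```javascript
// Hypothetical helper: build native-format rows for errorBars.
// xs is an array of x-values; series is an array of
// { values: [...], stds: [...] } objects aligned with xs.
function toErrorBarRows(xs, series) {
  return xs.map((x, i) => [
    x,
    ...series.map(({ values, stds }) =>
      // A missing point is null for the whole pair, never [null, std].
      values[i] == null || stds[i] == null ? null : [values[i], stds[i]]
    ),
  ]);
}

const rows = toErrorBarRows(
  [1, 2],
  [
    { values: [10, null], stds: [5, 5] },
    { values: [20, 22], stds: [5, 5] },
  ]
);
console.log(JSON.stringify(rows));
```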

<a name="function"></a><h3>Functions</h3>

<p>You can specify a function that returns any of the other types. If
<i>x</i> is a valid piece of dygraphs input, then so is</p>

<code>
  function() { return x; }
</code>
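
<p>For instance, a function can generate CSV text on demand. The sketch below
is only an example of such a function; <i>makeCsv</i> and its column names
are made up for illustration.</p>

```javascript
// Hypothetical data function: returns CSV text with a header line
// followed by x and x squared for x = 0..3.
function makeCsv() {
  let csv = "X,Square\n";
  for (let x = 0; x <= 3; x++) {
    csv += x + "," + x * x + "\n";
  }
  return csv;
}

console.log(makeCsv());
```

<p>You would then pass <i>makeCsv</i> itself (not its result) as the
<i>data</i> parameter.</p>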

<a name="datatable"></a><h3>DataTable</h3>
<p>You can also specify a Google Visualization Library <a
href="http://code.google.com/apis/visualization/documentation/reference.html#DataTable">DataTable</a>
object as your input data. This lets you easily switch between dygraphs and
other gviz visualizations such as the Annotated Timeline. It also lets you
embed a Dygraph in a Google Spreadsheet.</p>

<p>You'll need to set your first column's type to one of "number", "date"
or "datetime".</p>

<pre>
DataTable TODO:
- When to use Dygraph.GvizWrapper
- how to specify fractions
- how to specify missing data
- how to specify value + std. dev.
- how to specify [low, middle, high]
- walkthrough of embedding a gadget in google docs/on a web page
- walkthrough of using std. dev. in a spreadsheet chart
</pre>

</body>
</html>