<!--#include virtual="header.html" -->

<style type="text/css">
  code { white-space: pre; border: 1px dashed black; display: block; }
  pre { white-space: pre; border: 1px dashed black; }
</style>

<h2>dygraphs Data Format</h2>

<p>When you create a Dygraph object, your code looks something like
this:</p>

<code>
  g = new Dygraph(document.getElementById("div"),
                  <i>data</i>,
                  { <i>options</i> });

</code>

<p>This document is about what you can put in the <i>data</i>
parameter.</p>

<p>There are five types of input that dygraphs will accept:</p>
<ol>
  <li><a href="#csv">CSV data</a>
  <li><a href="#url">URL</a>
  <li><a href="#array">array (native format)</a>
  <li><a href="#function">function</a>
  <li><a href="#datatable">DataTable</a>
</ol>

<p>These are all discussed below. If you're trying to debug why your input
won't parse, <b>check the JS error console</b>. dygraphs tries to log
informative errors explaining what's wrong with your data, and these can
often point you in the right direction.</p>

<p>There are several options which affect how your input data is
interpreted. These are:</p>
<ul>
  <li> <i>xValueParser</i> affects CSV only.
  <li> <i>errorBars</i> affects all input types.
  <li> <i>customBars</i> affects all input types.
  <li> <i>fractions</i> affects all input types.
  <li> <i>labels</i> affects all input types.
</ul>

<a name="csv"></a>
<h3>CSV</h3>
<p>Here's an example of what CSV data should look like:</p>
<pre>
Date,Series1,Series2
2009/07/12,100,200  # comments are OK on data lines
2009/07/19,150,201
</pre>

<p>"CSV" is actually a bit of a misnomer: the data can be tab-delimited,
too. The delimiter is set by the <i>delimiter</i> option. It defaults to ",".
If no delimiter is found in the first row, it switches over to tab.</p>

<p>CSV parsing can be split into three parts: headers, x-values and
y-values.</p>

<h4>Headers</h4>
<p>If you don't specify the <i>labels</i> option, dygraphs will look at the
first line of your CSV data to get the labels. If you see numbers for series
labels when you hover over the dygraph, it's likely because your first line
contains data but is being parsed as a label. The solution is to either add
a header line or specify the labels like this:</p>

<code>
  new Dygraph(el,
              "2009/07/12,100,200\n" +
              "2009/07/19,150,201\n",
              { labels: [ "Date", "Series1", "Series2" ] });
</code>

<h4>x-values</h4>
<p>Once the headers are parsed, dygraphs needs to determine the type of the
x-values. They're either dates or numbers. To make this determination, it
looks at the first column of the first row ("2009/07/12" in the example
above). Here's the heuristic: if it contains a '-' or a '/', or otherwise
doesn't parse as a float, then it's a date. Otherwise, it's a
number.</p>
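<p>The heuristic can be sketched as a small standalone function. This is a
literal reading of the rule described above, not dygraphs' own source; the
name <i>looksLikeDate</i> is hypothetical, and the library's real check may
treat edge cases (such as a leading minus sign) differently.</p>

```javascript
// Hypothetical helper: a literal reading of the documented heuristic.
// Not taken from the dygraphs source.
function looksLikeDate(firstCell) {
  // Any '-' or '/' marks the value as a date.
  if (firstCell.indexOf("-") >= 0 || firstCell.indexOf("/") >= 0) {
    return true;
  }
  // Otherwise it is a date only if it fails to parse as a float.
  return isNaN(parseFloat(firstCell));
}

var samples = ["2009/07/12", "2009-07-12", "1247382000", "12.5", "July"];
var detected = samples.map(function (s) {
  return s + " -> " + (looksLikeDate(s) ? "date" : "number");
});
console.log(detected.join("\n"));
```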

<p>Determining the type doesn't guarantee that every value will parse
correctly. The general rule is:</p>

<ul>
  <li>For dates, your strings have to be parseable by <i>Date.parse</i>.
  <li>For numbers, your strings have to be parseable by <i>parseFloat</i>.
</ul>

<p>You can manually verify this using a JavaScript console. If a value
doesn't parse, dygraphs will put a warning about it on your console. But
beware: different browsers support different date formats!</p>
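<p>A quick way to run that check by hand is to feed candidate strings to
<i>Date.parse</i> and look for NaN. The helper below is a hypothetical
sketch (<i>checkDateStrings</i> is not part of dygraphs); results for
non-ISO formats can vary between JavaScript engines, so run it in the
browsers you care about.</p>

```javascript
// Hypothetical helper: report which strings Date.parse accepts in the
// current JavaScript engine.
function checkDateStrings(strings) {
  return strings.map(function (s) {
    var millis = Date.parse(s);
    return { input: s, ok: !isNaN(millis), millis: millis };
  });
}

var report = checkDateStrings(["2009-07-12", "2009/07/12", "not a date"]);
report.forEach(function (r) {
  console.log(r.input + ": " + (r.ok ? "parses" : "does NOT parse"));
});
```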

<p>Here are some valid date formats:</p>
<ul>
  <li>2009-07-12</li>
  <li>2009/07/12</li>
  <li>2009/07/12 12</li>
  <li>2009/07/12 12:34</li>
  <li>2009/07/12 12:34:56</li>
</ul>

<p>If you specify the <i>xValueParser</i> option, then all this detection is
bypassed and your function is called instead. Your parser function takes in
a string and needs to return a number. For dates/times, you should return
milliseconds since epoch. You may also want to specify a few other options
to make sure that everything gets displayed properly.</p>

<p>Here's code which parses a CSV file with unix timestamps in the first
column:</p>

<code>
  new Dygraph(el,
              "Date,Series1,Series2\n" +
              "1247382000,100,200\n" +
              "1247986800,150,201\n",
              {
                xValueFormatter: Dygraph.dateString_,
                // unix timestamps are in seconds; dygraphs wants milliseconds
                xValueParser: function(x) { return 1000*parseInt(x, 10); },
                xTicker: Dygraph.dateTicker
              });
</code>
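<p>The parser itself can be tried outside of a chart. The sketch below
applies the same seconds-to-milliseconds conversion to the first timestamp
from the example; the function name <i>parseUnixSeconds</i> is
hypothetical.</p>

```javascript
// Hypothetical name; same conversion as the xValueParser in the example.
function parseUnixSeconds(x) {
  return 1000 * parseInt(x, 10);
}

var millis = parseUnixSeconds("1247382000");
console.log(millis);                          // 1247382000000
console.log(new Date(millis).toISOString());  // 2009-07-12T07:00:00.000Z
```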

<h4>y-values</h4>
<p>Dependent (y-axis) values are simpler than x-values because they're
always numbers. The complexity here comes from the various ways that you can
specify the uncertainty in your measurements.</p>

<p>If your y-values are just numbers, then they need to be parseable by
JavaScript's parseFloat function. Acceptable formats include:</p>

<ul>
  <li>12
  <li>-12
  <li>12.
  <li>12.3
  <li>1.24e+1
  <li>-1.24e+1
</ul>
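<p>To see how these formats come out, you can pass each one through
<i>parseFloat</i> directly. This is only a console-style check, not
dygraphs code.</p>

```javascript
// The formats listed above, run through parseFloat.
var formats = ["12", "-12", "12.", "12.3", "1.24e+1", "-1.24e+1"];
var parsed = formats.map(function (s) { return parseFloat(s); });
console.log(parsed);  // [ 12, -12, 12, 12.3, 12.4, -12.4 ]
```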

<p>If you have missing data, just leave the column blank (your CSV file will
probably contain a ",," in it).</p>

<p>If your numbers have uncertainty associated with them, then there are
three basic ways to express this: using fractions, standard deviations or
explicit ranges.</p>

<h5>Fractions</h5>
<p>If you specify the <i>fractions</i> option, then your data will all be
interpreted as ratios between zero and one. This is often the case if you're
plotting a percentage.</p>

<code>
  new Dygraph(el,
              "X,Frac1,Frac2\n" +
              "1,1/2,3/4\n" +
              "2,1/3,2/3\n" +
              "3,2/3,17/49\n" +
              "4,25/30,100/200",
              { fractions: true });
</code>

<p>Why not just divide the fractions out yourself? There are two attractive
reasons not to:</p>

<ul>
  <li>If you set both <i>fractions</i> and <i>errorBars</i>, then the
  denominator is interpreted as a sample size and dygraphs will plot <a
  href="http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval">Wilson
  binomial proportion confidence intervals</a> around each point.</li>

  <li>If you set <i>showRoller</i>, then dygraphs will combine the values as
  fractions. If two points are <i>a/b</i> and <i>c/d</i>, it will plot
  <i>(a+c) / (b+d)</i> rather than <i>(a/b + c/d) / 2</i>, which is what
  you'd get if you divided the fractions through. This will also shrink the
  confidence intervals.</li>
</ul>
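<p>Both points can be illustrated with a short sketch. The functions below
are hypothetical stand-ins, not dygraphs' implementation: one combines
fractions by summing numerators and denominators, one averages the
divided-out ratios for comparison, and one computes a Wilson score interval
from the standard textbook formula, with <i>z</i> as the assumed width in
standard deviations.</p>

```javascript
// Hypothetical helpers; each pair is [numerator, denominator].

// Combine as fractions: sum numerators and denominators, then divide.
function combineFractions(pairs) {
  var num = 0, den = 0;
  pairs.forEach(function (p) { num += p[0]; den += p[1]; });
  return num / den;
}

// What you would get after dividing each fraction out first.
function averageOfRatios(pairs) {
  var sum = 0;
  pairs.forEach(function (p) { sum += p[0] / p[1]; });
  return sum / pairs.length;
}

// Wilson score interval for proportion p over n samples, z std. devs wide.
function wilsonInterval(p, n, z) {
  var denom = 1 + (z * z) / n;
  var center = (p + (z * z) / (2 * n)) / denom;
  var half = (z * Math.sqrt((p * (1 - p)) / n + (z * z) / (4 * n * n))) / denom;
  return [center - half, center + half];
}

var pairs = [[1, 2], [9, 10]];
console.log(combineFractions(pairs));  // 10/12, weighted by sample size
console.log(averageOfRatios(pairs));   // about 0.7, ignores sample size
console.log(wilsonInterval(0.5, 100, 2));
```

<p>The combined value leans toward the point with the larger denominator,
which is the reason to keep numerators and denominators separate.</p>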

<h5>Standard Deviations</h5>
<p>Often you have a measurement and also a measure of its uncertainty: a
standard deviation. If you specify the <i>errorBars</i> option, dygraphs
will look for alternating value and standard deviation columns in your CSV
data. Here's what it should look like:</p>

<code>
  new Dygraph(el,
              "X,Y1,Y2\n" +
              "1,10,5,20,5\n" +
              "2,12,5,22,5\n",
              { errorBars: true });
</code>

<p>The "5" values are standard deviations. When each point is plotted, a
2-standard deviation region around it is shaded, resulting in a 95%
confidence interval. If you want more or less confidence, you can set the
<i>sigma</i> option to something other than 2.0.</p>

<p>When you roll data with standard deviations, dygraphs will plot the
average of your values in each rolling period and the RMS value of your
standard deviations: sqrt(std1<sup>2</sup> + std2<sup>2</sup> + std3<sup>2</sup> + ... + stdN<sup>2</sup>)/N.</p>
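<p>As a rough sketch of that arithmetic, assuming the standard deviations
are squared before they are summed (as an RMS-style combination implies),
one rolling window could be reduced as follows. The function names are
hypothetical and this is not dygraphs' source.</p>

```javascript
// Hypothetical helpers; each point is [value, stdDev].

// Reduce one rolling window to [average value, combined std. dev.].
function rollWithStdDev(points) {
  var n = points.length;
  var sum = 0, sumSquares = 0;
  points.forEach(function (p) {
    sum += p[0];
    sumSquares += p[1] * p[1];
  });
  return [sum / n, Math.sqrt(sumSquares) / n];
}

// Shaded band around a point: value +/- sigma standard deviations.
function errorBand(value, stdDev, sigma) {
  return [value - sigma * stdDev, value + sigma * stdDev];
}

console.log(rollWithStdDev([[10, 3], [12, 4]]));  // [ 11, 2.5 ]
console.log(errorBand(10, 5, 2));                 // [ 0, 20 ]
```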

<h5>Custom error bars</h5>
<p>Sometimes your data has asymmetric uncertainty or you want to specify
something else with the error bars around a point. One example of this is
the "temperatures" demo on the <a href="http://danvk.org/dygraphs">dygraphs
home page</a>, where the point is the daily average and the bars denote
the low and high temperatures for the day.</p>

<p>To specify this format, set the <i>customBars</i> option. Your CSV values
should each be three numbers separated by semicolons ("low;mid;high").
Here's an example:</p>

<code>
  new Dygraph(el,
              "X,Y1,Y2\n" +
              "1,10;20;30,20;5;25\n" +
              "2,10;25;35,20;10;25\n",
              { customBars: true });
</code>

<p>The middle value need not lie between the low and high values. If you set
a rolling period, the three values will all be averaged independently.</p>
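<p>A minimal sketch of both steps, under the assumption that a rolling
window is a plain mean of each component: splitting a "low;mid;high" cell
and averaging the three values independently. Both function names are
hypothetical.</p>

```javascript
// Hypothetical helpers, not dygraphs source.

// Split one CSV cell such as "10;20;30" into [low, mid, high].
function parseCustomBarCell(cell) {
  return cell.split(";").map(function (s) { return parseFloat(s); });
}

// Average low, mid and high independently across one rolling window.
function rollCustomBars(points) {
  var totals = [0, 0, 0];
  points.forEach(function (p) {
    totals[0] += p[0];
    totals[1] += p[1];
    totals[2] += p[2];
  });
  return totals.map(function (t) { return t / points.length; });
}

var win = ["10;20;30", "10;25;35"].map(parseCustomBarCell);
console.log(rollCustomBars(win));  // [ 10, 22.5, 32.5 ]
```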


<a name="url"></a>
<h3>URL</h3>
<p>If you pass in a URL, dygraphs will issue an XMLHttpRequest for it and
attempt to parse the returned data as CSV.
</p>

<p><i>Common problems</i>. Make sure the URL is accessible and returns data
in text format (as opposed to a CSV file with an HTML header). You can see
what the response looks like by checking your JS console or by requesting
the URL yourself.</p>


<a name="array"></a>
<h3>Array (native format)</h3>
<p>If you'll be constructing your data set from a server-side program (or
from JavaScript) then you're better off producing an array than CSV data.
This saves the cost of parsing the CSV data and also avoids common parser
errors.</p>

<p>The downside is that it's harder to look at your data (you'll need to use
a JS debugger) and that the data format is a bit less clear for values with
uncertainties.</p>

<p>Here's an example of "native format":</p>

<code>
  new Dygraph(document.getElementById("graphdiv2"),
              [
                [1,10,100],
                [2,20,80],
                [3,50,60],
                [4,70,80]
              ],
              {
                labels: [ "x", "A", "B" ]
              });
</code>

<h4>Headers</h4>
<p>Headers for native format must be specified via the <i>labels</i>
option. There's no other way to set them.</p>

<h4>x-values</h4>
<p>If you want your x-values to be dates, you'll need to specify a Date
object in the first column. Otherwise, specify a number. Here's a sample
array with dates on the x-axis:</p>

<code>
  [
    [ new Date("2009/07/12"), 100, 200 ],
    [ new Date("2009/07/19"), 150, 220 ]
  ]
</code>
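<p>If your data arrives as JSON records, the rows can be built with a small
mapping step. The record shape (<i>day</i>, <i>a</i>, <i>b</i>) and the
function name <i>toNativeRows</i> below are assumptions made for
illustration only.</p>

```javascript
// Hypothetical input: records as they might arrive from a server as JSON.
var records = [
  { day: "2009-07-12", a: 100, b: 200 },
  { day: "2009-07-19", a: 150, b: 220 }
];

// Build native-format rows: a Date object first, then one number per series.
// Note: ISO date-only strings like "2009-07-12" are interpreted as UTC.
function toNativeRows(recs) {
  return recs.map(function (r) {
    return [new Date(r.day), r.a, r.b];
  });
}

var dateRows = toNativeRows(records);
console.log(dateRows);
```

<p>The resulting array can be passed as the <i>data</i> argument, together
with a matching <i>labels</i> option.</p>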

<h4>y-values</h4>
<p>You can specify <i>errorBars</i>, <i>fractions</i> or <i>customBars</i>
with the array format. If you specify any of these, the values become arrays
(rather than numbers). Here's what the format looks like for each one:</p>

<code>
  <i>errorBars</i>:  [x, [value1, std1], [value2, std2], ...]
  <i>fractions</i>:  [x, [num1, den1], [num2, den2], ...]
  <i>customBars</i>: [x, [low1, val1, high1], [low2, val2, high2], ...]
</code>

<p>To specify missing data, set the value to null or NaN. You may not set a
value inside one of the inner arrays to null or NaN; replace the entire
inner array with null or NaN instead. The two behave differently only when
the option
<a href="options.html#connectSeparatedPoints">connectSeparatedPoints</a>
is true. In that case, the gaps created by nulls are filled in, and gaps
created by NaNs are preserved.
</p>
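<p>Here is a small illustration of that rule using <i>errorBars</i>-style
rows. The checker function <i>hasPartialMissing</i> is a hypothetical
helper for catching the disallowed form before handing data to dygraphs; it
is not part of the library.</p>

```javascript
// errorBars-style rows: [x, [value, std], [value, std]].
// The first series has no data at x = 2, so the WHOLE pair is null.
var barRows = [
  [1, [10, 5], [20, 5]],
  [2, null,    [22, 5]],
  [3, [14, 5], [24, 5]]
];

// Hypothetical check: true if any inner array contains null or NaN,
// which is the form the text above says is not allowed.
function hasPartialMissing(rows) {
  return rows.some(function (row) {
    return row.slice(1).some(function (cell) {
      return Array.isArray(cell) && cell.some(function (v) {
        return v === null || (typeof v === "number" && isNaN(v));
      });
    });
  });
}

console.log(hasPartialMissing(barRows));              // false
console.log(hasPartialMissing([[1, [null, 5]]]));     // true
```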

<a name="function"></a>
<h3>Functions</h3>

<p>You can specify a function that returns any of the other types. If
<i>x</i> is a valid piece of dygraphs input, then so is</p>

<code>
  function() { return x; }
</code>

<p>Functions can return strings, arrays, data tables, URLs, or any other
valid dygraphs input.</p>
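<p>Conceptually, a function input is called and its return value is treated
like any other input. The sketch below shows that idea with a hypothetical
<i>resolveData</i> helper; it is an illustration, not how dygraphs is
implemented internally.</p>

```javascript
// Hypothetical resolver: call the input if it is a function, else use it as-is.
function resolveData(data) {
  return typeof data === "function" ? data() : data;
}

// A function returning native-format rows.
function makeRows() {
  return [[1, 10, 100], [2, 20, 80]];
}

console.log(resolveData(makeRows));        // the two rows
console.log(resolveData("X,Y\n1,2\n"));    // a CSV string passes through
```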

<a name="datatable"></a>
<h3>DataTable</h3>
<p>You can also specify a Google Visualization Library <a
href="http://code.google.com/apis/visualization/documentation/reference.html#DataTable">DataTable</a>
object as your input data. This lets you easily switch between dygraphs and
other gviz visualizations such as the Annotated Timeline. It also lets you
embed a Dygraph in a Google Spreadsheet.</p>

<p>You'll need to set your first column's type to one of "number", "date"
or "datetime".</p>

<pre>
DataTable TODO:
- When to use Dygraph.GvizWrapper
- how to specify fractions
- how to specify missing data
- how to specify value + std. dev.
- how to specify [low, middle, high]
- walkthrough of embedding a gadget in google docs/on a web page
- walkthrough of using std. dev. in a spreadsheet chart
</pre>

<!--#include virtual="footer.html" -->