<html>
<head>
  <title>dygraphs input types</title>
  <style type="text/css">
    code { white-space: pre; }
    pre { white-space: pre; }
  </style>
</head>
<body>
<h2>dygraphs Data Format</h2>

<p>When you create a Dygraph object, your code looks something like
this:</p>

<code>
  g = new Dygraph(document.getElementById("div"),
                  <i>data</i>,
                  { <i>options</i> });
</code>

<p>This document is about what you can put in the <i>data</i>
parameter.</p>

<p>There are five types of input that dygraphs will accept:</p>
<ol>
  <li>CSV data
  <li>URL
  <li>array (native format)
  <li>function
  <li>DataTable
</ol>

<p>These are all discussed below. If you're trying to debug why your input
won't parse, <b>check the JS error console</b>. dygraphs tries to log
informative errors explaining what's wrong with your data, and these can
often point you in the right direction.</p>

<p>There are several options which affect how your input data is
interpreted. These are:</p>
<ul>
  <li> <i>xValueParser</i> affects CSV only.
  <li> <i>errorBars</i> affects all input types.
  <li> <i>customBars</i> affects all input types.
  <li> <i>fractions</i> affects all input types.
  <li> <i>labels</i> affects all input types.
</ul>

<h3>CSV</h3>
<p>Here's an example of what CSV data should look like:</p>
<pre>
Date,Series1,Series2
2009/07/12,100,200  # comments are OK on data lines
2009/07/19,150,201
</pre>

<p>"CSV" is actually a bit of a misnomer: the data can be tab-delimited,
too. The delimiter is set by the <i>delimiter</i> option, which defaults to ",".
If no delimiter is found in the first row, dygraphs switches over to tab.</p>
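
<p>The fallback rule above can be sketched in a few lines. This is only an
illustration of the described behavior, not the library's own code; the
helper names <i>pickDelimiter</i> and <i>splitRows</i> are hypothetical.</p>

```javascript
// Hypothetical helper: choose a delimiter the way the paragraph above describes.
function pickDelimiter(csv, delimiter) {
  if (delimiter === undefined) delimiter = ",";  // documented default
  var firstRow = csv.split("\n")[0];
  // Fall back to tab when the configured delimiter never appears in the first row.
  if (firstRow.indexOf(delimiter) === -1) {
    return "\t";
  }
  return delimiter;
}

// Hypothetical helper: split CSV text into rows of string fields.
function splitRows(csv, delimiter) {
  var d = pickDelimiter(csv, delimiter);
  return csv.split("\n")
    .filter(function (line) { return line.length > 0; })  // skip blank lines
    .map(function (line) { return line.split(d); });
}
```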

<p>CSV parsing can be split into three parts: headers, x-values and
y-values.</p>

<h4>Headers</h4>
<p>If you don't specify the <i>labels</i> option, dygraphs will look at the
first line of your CSV data to get the labels. If you see numbers for series
labels when you hover over the dygraph, it's likely because your first line
contains data but is being parsed as labels. The solution is to either add
a header line or specify the labels like this:</p>

<code>
  new Dygraph(el,
              "2009/07/12,100,200\n" +
              "2009/07/19,150,201\n",
              { labels: [ "Date", "Series1", "Series2" ] });
</code>

<h4>x-values</h4>
<p>Once the headers are parsed, dygraphs needs to determine what the type of
the x values is. They're either dates or numbers. To make this
determination, it looks at the first column of the first row ("2009/07/12"
in the example above). Here's the heuristic: if it contains a '-' or a '/',
or otherwise doesn't parse as a float, then it's a date. Otherwise, it's a
number.</p>
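
<p>A minimal sketch of that heuristic follows. The function name
<i>detectXType</i> is hypothetical, and the sketch makes one assumption that
the text above does not state: a leading minus sign is treated as a sign, so
that negative numbers are not mistaken for dates.</p>

```javascript
// Hypothetical helper: classify the first x value as "date" or "number".
function detectXType(firstX) {
  // Assumption of this sketch: a '-' in position 0 is a sign, not a date separator.
  var hasDash = firstX.indexOf("-") > 0;
  var hasSlash = firstX.indexOf("/") >= 0;
  if (hasDash || hasSlash || isNaN(parseFloat(firstX))) {
    return "date";
  }
  return "number";
}
```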

<p>Once the type is determined, that doesn't mean all the values will parse
correctly. The general rule is:</p>

<ul>
  <li>For dates, your strings have to be parseable by <i>Date.parse</i>.
  <li>For numbers, your strings have to be parseable by <i>parseFloat</i>.
</ul>

<p>You can manually verify this using a JavaScript console. If a value
doesn't parse, dygraphs will put a warning about it on your console. But
beware: different browsers support different date formats!</p>

<p>Here are some valid date formats:</p>
<ul>
  <li>2009-07-12</li>
  <li>2009/07/12</li>
  <li>2009/07/12 12</li>
  <li>2009/07/12 12:34</li>
  <li>2009/07/12 12:34:56</li>
</ul>
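
<p>One way to run the manual check described above is to paste something like
the following into the console of each browser you care about. The helper
name <i>findUnparseableDates</i> is hypothetical; it only relies on
<i>Date.parse</i> returning NaN for strings it cannot handle.</p>

```javascript
// Hypothetical helper: return the strings that Date.parse rejects in this browser.
function findUnparseableDates(strings) {
  return strings.filter(function (s) {
    return isNaN(Date.parse(s));
  });
}
```

<p>Calling it with your own x-values, for example
<i>findUnparseableDates(["2009/07/12", "2009/07/12 12:34"])</i>, gives an
empty array when every string parses in the current browser.</p>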

<p>If you specify the <i>xValueParser</i> option, then all this detection is
bypassed and your function is called instead. Your parser function takes in
a string and needs to return a number. For dates/times, you should return
milliseconds since epoch. You may also want to specify a few other options
to make sure that everything gets displayed properly.</p>

<p>Here's code which parses a CSV file with unix timestamps in the first
column:</p>

<code>
  new Dygraph(el,
              "Date,Series1,Series2\n" +
              "1247382000,100,200\n" +
              "1247986800,150,201\n",
              {
                xValueFormatter: Dygraph.dateString_,
                // Unix timestamps are in seconds; convert to milliseconds.
                xValueParser: function(x) { return 1000 * parseInt(x, 10); },
                xTicker: Dygraph.dateTicker
              });
</code>

<h4>y-values</h4>
<p>Dependent (y-axis) values are simpler than x-values because they're
always numbers. The complexity here comes from the various ways that you can
specify the uncertainty in your measurements.</p>

<p>If your y-values are just numbers, then they need to be parseable by
JavaScript's parseFloat function. Acceptable formats include:</p>

<ul>
  <li>12
  <li>-12
  <li>12.
  <li>12.3
  <li>1.24e+1
  <li>-1.24e+1
</ul>

<p>If you have missing data, just leave the column blank (your CSV file will
probably contain a ",," in it).</p>
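
<p>To see how a blank column behaves, here is a small sketch that parses the
y-values of one CSV line with <i>parseFloat</i>. The helper name
<i>parseYValues</i> is hypothetical, and representing a missing value as
<i>null</i> is an assumption of this sketch.</p>

```javascript
// Hypothetical helper: parse the y-values (every field after the first) of one line.
function parseYValues(line, delimiter) {
  var fields = line.split(delimiter || ",");
  return fields.slice(1).map(function (field) {
    var value = parseFloat(field);
    // parseFloat("") is NaN; this sketch stores a blank field as null.
    return isNaN(value) ? null : value;
  });
}
```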

<p>If your numbers have uncertainty associated with them, then there are
three basic ways to express this: using fractions, standard deviations or
explicit ranges.</p>

<h5>Fractions</h5>
<p>If you specify the <i>fractions</i> option, then your data will all be
interpreted as ratios between zero and one. This is often the case if you're
plotting a percentage.</p>

<code>
  new Dygraph(el,
              "X,Frac1,Frac2\n" +
              "1,1/2,3/4\n" +
              "2,1/3,2/3\n" +
              "3,2/3,17/49\n" +
              "4,25/30,100/200",
              { fractions: true });
</code>

<p>Why not just divide the fractions out yourself? There are two good
reasons not to:</p>

<ul>
  <li>If you set both <i>fractions</i> and <i>errorBars</i>, then the
  denominator is interpreted as a sample size and dygraphs will plot <a
  href="http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval">Wilson
  binomial proportion confidence intervals</a> around each point.</li>

  <li>If you set <i>showRoller</i>, then dygraphs will combine the values as
  fractions. If two points are <i>a/b</i> and <i>c/d</i>, it will plot
  <i>(a+c) / (b+d)</i> rather than <i>(a/b + c/d) / 2</i>, which is what
  you'd get if you divided the fractions through. This will also shrink the
  confidence intervals.</li>
</ul>
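
<p>Both points can be illustrated with a short sketch. It pools numerators
and denominators when rolling, so that <i>1/2</i> and <i>3/4</i> combine to
<i>4/6</i>, and it computes a Wilson score interval from a numerator and a
sample size. All function names here are hypothetical, and the <i>z</i>
parameter (the number of standard deviations) is an assumption of the sketch
rather than a documented option.</p>

```javascript
// Hypothetical helper: turn "a/b" into its numerator and denominator.
function parseFraction(text) {
  var parts = text.split("/");
  return { num: parseFloat(parts[0]), den: parseFloat(parts[1]) };
}

// Combine fractions by pooling: a/b and c/d become (a+c)/(b+d).
function rollFractions(fractions) {
  var num = 0, den = 0;
  fractions.forEach(function (f) {
    num += f.num;
    den += f.den;
  });
  return { num: num, den: den, value: num / den };
}

// Wilson score interval for num successes out of den trials.
// z is the number of standard deviations (an assumed parameter of this sketch).
function wilsonInterval(num, den, z) {
  var p = num / den;
  var z2 = z * z;
  var denom = 1 + z2 / den;
  var center = (p + z2 / (2 * den)) / denom;
  var half = (z / denom) * Math.sqrt(p * (1 - p) / den + z2 / (4 * den * den));
  return [center - half, center + half];
}
```

<p>Because pooling increases the denominator, the interval computed for the
rolled fraction is narrower than the interval for either input point.</p>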

<h5>Standard Deviations</h5>
<p>Often you have a measurement and also a measure of its uncertainty: a
standard deviation. If you specify the <i>errorBars</i> option, dygraphs
will look for alternating value and standard deviation columns in your CSV
data. Here's what it should look like:</p>

<code>
  new Dygraph(el,
              "X,Y1,Y2\n" +
              "1,10,5,20,5\n" +
              "2,12,5,22,5\n",
              { errorBars: true });
</code>

<p>The "5" values are standard deviations. When each point is plotted, a
2-standard deviation region around it is shaded, resulting in a roughly 95%
confidence interval. If you want more or less confidence, you can set the
<i>sigma</i> option to something other than 2.0.</p>
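
<p>As a rough illustration of the column layout and the shaded region, the
sketch below splits one <i>errorBars</i> row into value and standard
deviation pairs and computes the band for a given <i>sigma</i>. The helper
names are hypothetical.</p>

```javascript
// Hypothetical helper: split "x,v1,sd1,v2,sd2,..." into x and [value, stddev] pairs.
function parseErrorBarRow(line) {
  var fields = line.split(",").map(function (f) { return parseFloat(f); });
  var series = [];
  for (var i = 1; i + 1 < fields.length; i += 2) {
    series.push([fields[i], fields[i + 1]]);
  }
  return { x: fields[0], series: series };
}

// Lower and upper edge of the shaded region around one point.
function errorBand(value, stddev, sigma) {
  if (sigma === undefined) sigma = 2.0;  // default described in the text above
  return [value - sigma * stddev, value + sigma * stddev];
}
```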

<p>When you roll data with standard deviations, dygraphs will plot the
average of your values in each rolling period and the root-sum-square of your
standard deviations divided by N:
sqrt(std1<sup>2</sup> + std2<sup>2</sup> + std3<sup>2</sup> + ... + stdN<sup>2</sup>)/N.</p>
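
<p>Here is a minimal sketch of rolling one window of points. It averages the
values and squares each standard deviation before summing, taking the square
root and dividing by N. The function name <i>rollWithStdDev</i> is
hypothetical.</p>

```javascript
// Hypothetical helper: roll a window of [value, stddev] pairs into one pair.
function rollWithStdDev(points) {
  var n = points.length;
  var sum = 0;
  var sumSquares = 0;
  points.forEach(function (p) {
    sum += p[0];
    sumSquares += p[1] * p[1];  // standard deviations are squared before summing
  });
  return [sum / n, Math.sqrt(sumSquares) / n];
}
```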

<h5>Custom error bars</h5>
<p>Sometimes your data has asymmetric uncertainty or you want to specify
something else with the error bars around a point. One example of this is
the "temperatures" demo on the <a href="http://danvk.org/dygraphs">dygraphs
home page</a>, where the point is the daily average and the bars denote
the low and high temperatures for the day.</p>

<p>To specify this format, set the <i>customBars</i> option. Your CSV values
should each be three numbers separated by semicolons ("low;mid;high").
Here's an example:</p>

<code>
  new Dygraph(el,
              "X,Y1,Y2\n" +
              "1,10;20;30,20;5;25\n" +
              "2,10;25;35,20;10;25\n",
              { customBars: true });
</code>

<p>The middle value need not lie between the low and high values. If you set
a rolling period, the three values will all be averaged independently.</p>
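
<p>The independent averaging can be sketched as follows. The helper names
<i>parseCustomBar</i> and <i>rollCustomBars</i> are hypothetical; the sketch
simply averages each of the three components over the window.</p>

```javascript
// Hypothetical helper: turn "low;mid;high" into an object of three numbers.
function parseCustomBar(text) {
  var parts = text.split(";").map(function (p) { return parseFloat(p); });
  return { low: parts[0], mid: parts[1], high: parts[2] };
}

// Average low, mid and high independently over a rolling window.
function rollCustomBars(bars) {
  var n = bars.length;
  var total = { low: 0, mid: 0, high: 0 };
  bars.forEach(function (b) {
    total.low += b.low;
    total.mid += b.mid;
    total.high += b.high;
  });
  return { low: total.low / n, mid: total.mid / n, high: total.high / n };
}
```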


<h3>URL</h3>
<p>If you pass in a URL, dygraphs will issue an XMLHttpRequest for it and
attempt to parse the returned data as CSV.
</p>

<p><i>Common problems</i>. Make sure the URL is accessible and returns data
in text format (as opposed to a CSV file with an HTML header). You can see
what the response looks like by checking your JS console or by requesting
the URL yourself.</p>
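
<p>If you have fetched the response text yourself, a quick check like the one
below can flag the HTML case. This is only a debugging aid under a simple
assumption (CSV data never starts with a '&lt;' character); the helper name
<i>looksLikeHtml</i> is hypothetical and is not part of dygraphs.</p>

```javascript
// Hypothetical debugging aid: does the response body start with markup?
function looksLikeHtml(responseText) {
  // Assumption: real CSV data does not begin with '<' after leading whitespace.
  return /^\s*</.test(responseText);
}
```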


<h3>Array (native format)</h3>
<p>If you'll be constructing your data set from a server-side program (or
from JavaScript) then you're better off producing an array than CSV data.
This saves the cost of parsing the CSV data and also avoids common parser
errors.</p>

<p>The downside is that it's harder to look at your data (you'll need to use
a JS debugger) and that the data format is a bit less clear for values with
uncertainties.</p>
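
<p>As a sketch of what such an array might look like, the code below builds
the two rows from the CSV example as an array of rows, each holding an
x-value followed by one y-value per series. It assumes Date objects are used
for date x-values and that the series names are supplied through the
<i>labels</i> option; the record field names and the helper
<i>toNativeRows</i> are hypothetical.</p>

```javascript
// Series names, passed separately because an array has no header line.
var labels = ["Date", "Series1", "Series2"];

// Hypothetical helper: convert plain records into [x, y1, y2] rows.
function toNativeRows(records) {
  return records.map(function (r) {
    // JavaScript months are zero-based, so July is 6.
    return [new Date(r.year, r.month - 1, r.day), r.series1, r.series2];
  });
}

var data = toNativeRows([
  { year: 2009, month: 7, day: 12, series1: 100, series2: 200 },
  { year: 2009, month: 7, day: 19, series1: 150, series2: 201 }
]);

// In a page with dygraphs loaded, this would be passed as:
//   new Dygraph(el, data, { labels: labels });
```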


<pre>
Array
 - disclaimers
 - Dates on the x-axis
 - how to specify fractions
 - how to specify missing data
 - how to specify value + std. dev.
 - how to specify [low, middle, high]

Functions
 - make sure they work as expected:
   function() { return x; }
   is identical as a source to "x".

DataTable
 - Links to relevant gviz docs
 - When to use Dygraph.GvizWrapper
 - how to specify fractions
 - how to specify missing data
 - how to specify value + std. dev.
 - how to specify [low, middle, high]
 - walkthrough of embedding a gadget in google docs/on a web page
 - walkthrough of using std. dev. in a spreadsheet chart
</pre>

</body>
</html>