diff --git a/docs/data.html b/docs/data.html

new file mode 100644

index 0000000..2a96e94
--- /dev/null
+++ b/docs/data.html
@@ -0,0 +1,275 @@
+<html>
+  <head>
+    <title>dygraphs input types</title>
+    <style type="text/css">
+      code { white-space: pre; }
+      pre  { white-space: pre; }
+    </style>
+  </head>
+  <body>
+    <h2>dygraphs Data Format</h2>
+
+    <p>When you create a Dygraph object, your code looks something like
+    this:</p>
+
+    <code>
+      g = new Dygraph(document.getElementById("div"),
+                      <i>data</i>,
+                      { <i>options</i> });
+    </code>
+
+    <p>This document is about what you can put in the <i>data</i>
+    parameter.</p>
+
+    <p>There are five types of input that dygraphs will accept:</p>
+    <ol>
+      <li>CSV data
+      <li>URL
+      <li>array (native format)
+      <li>function
+      <li>DataTable
+    </ol>
+
+    <p>These are all discussed below. If you're trying to debug why your input
+    won't parse, <b>check the JS error console</b>. dygraphs tries to log
+    informative errors explaining what's wrong with your data, and these can
+    often point you in the right direction.</p>
+
+    <p>Several options affect how your input data is interpreted:</p>
+    <ul>
+      <li> <i>xValueParser</i> affects CSV only.
+      <li> <i>errorBars</i> affects all input types.
+      <li> <i>customBars</i> affects all input types.
+      <li> <i>fractions</i> affects all input types.
+      <li> <i>labels</i> affects all input types.
+    </ul>
+
+    <h3>CSV</h3>
+    <p>Here's an example of what CSV data should look like:</p>
+    <pre>
+    Date,Series1,Series2
+    2009/07/12,100,200  # comments are OK on data lines
+    2009/07/19,150,201
+    </pre>
+
+    <p>"CSV" is actually a bit of a misnomer: the data can be tab-delimited,
+    too. The delimiter is set by the <i>delimiter</i> option. It defaults to ",".
+    If no delimiter is found in the first row, it switches over to tab.</p>
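+
+    <p>The fallback described above can be sketched roughly as follows. The
+    helper name <i>pickDelimiter</i> is made up for illustration and is not
+    part of dygraphs; it only mirrors the rule stated in the text.</p>
+

```javascript
// Hypothetical helper (not a dygraphs function): choose the delimiter for
// a block of CSV text, given the configured delimiter (default ",").
function pickDelimiter(csv, delimiter = ",") {
  const firstRow = csv.split("\n")[0];
  // Fall back to tab when the configured delimiter is absent from the first row.
  return firstRow.includes(delimiter) ? delimiter : "\t";
}

const commaData = "Date,Series1,Series2\n2009/07/12,100,200\n";
const tabData = "Date\tSeries1\tSeries2\n2009/07/12\t100\t200\n";
console.log(pickDelimiter(commaData) === ",");  // true
console.log(pickDelimiter(tabData) === "\t");   // true
```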
+
+    <p>CSV parsing can be split into three parts: headers, x-value and
+    y-values.</p>
+
+    <h4>Headers</h4>
+    <p>If you don't specify the <i>labels</i> option, dygraphs will look at the
+    first line of your CSV data to get the labels. If you see numbers for series
+    labels when you hover over the dygraph, it's likely because your first line
+    contains data but is being parsed as a label. The solution is to either add
+    a header line or specify the labels like this:</p>
+
+    <code>
+      new Dygraph(el,
+                  "2009/07/12,100,200\n" +
+                  "2009/07/19,150,201\n",
+                  { labels: [ "Date", "Series1", "Series2" ] });
+    </code>
+
+    <h4>x-values</h4>
+    <p>Once the headers are parsed, dygraphs needs to determine the type of
+    the x-values. They're either dates or numbers. To make this
+    determination, it looks at the first column of the first row ("2009/07/12"
+    in the example above). Here's the heuristic: if it contains a '-' or a '/',
+    or otherwise doesn't parse as a float, then it's a date. Otherwise, it's a
+    number.</p>
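+
+    <p>Taken literally, the heuristic looks something like the sketch below.
+    <i>detectXType</i> is a hypothetical name, and the real implementation may
+    handle more cases; note that under this literal reading a value with a
+    leading minus sign would also be classified as a date.</p>
+

```javascript
// Hypothetical helper: classify the first x-value as "date" or "number",
// following the heuristic in the text literally.
function detectXType(firstValue) {
  const looksLikeDate =
    firstValue.includes("-") ||
    firstValue.includes("/") ||
    Number.isNaN(parseFloat(firstValue));
  return looksLikeDate ? "date" : "number";
}

console.log(detectXType("2009/07/12")); // date
console.log(detectXType("1247382000")); // number
console.log(detectXType("July 12"));    // date (parseFloat gives NaN)
```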
+
+    <p>Determining the type doesn't guarantee that every value will parse
+    correctly. The general rule is:</p>
+
+    <ul>
+      <li>For dates, your strings have to be parseable by <i>Date.parse</i>.
+      <li>For numbers, your strings have to be parseable by <i>parseFloat</i>.
+    </ul>
+
+    <p>You can manually verify this using a JavaScript console. If a value
+    doesn't parse, dygraphs will put a warning about it on your console. But
+    beware: different browsers support different date formats!</p>
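+
+    <p>A quick console check along these lines might look like the following.
+    <i>parsesAs</i> is a hypothetical helper, and the result for date strings
+    depends on the JavaScript engine you run it in.</p>
+

```javascript
// Hypothetical helper: report whether an x-value string would parse
// under the rule for its type (Date.parse for dates, parseFloat for numbers).
function parsesAs(type, str) {
  const parsed = type === "date" ? Date.parse(str) : parseFloat(str);
  return !Number.isNaN(parsed);
}

console.log(parsesAs("number", "12.3"));     // true
console.log(parsesAs("number", "abc"));      // false
console.log(parsesAs("date", "2009/07/12")); // engine-dependent
```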
+
+    <p>Here are some valid date formats:</p>
+    <ul>
+      <li>2009-07-12</li>
+      <li>2009/07/12</li>
+      <li>2009/07/12 12</li>
+      <li>2009/07/12 12:34</li>
+      <li>2009/07/12 12:34:56</li>
+    </ul>
+
+    <p>If you specify the <i>xValueParser</i> option, then all this detection is
+    bypassed and your function is called instead. Your parser function takes in
+    a string and needs to return a number. For dates/times, you should return
+    milliseconds since epoch. You may also want to specify a few other options
+    to make sure that everything gets displayed properly.</p>
+
+    <p>Here's code which parses a CSV file with unix timestamps in the first
+    column:</p>
+
+    <code>
+      new Dygraph(el,
+                  "Date,Series1,Series2\n" +
+                  "1247382000,100,200\n" +
+                  "1247986800,150,201\n",
+                  {
+                    xValueFormatter: Dygraph.dateString_,
+                    xValueParser: function(x) { return 1000*parseInt(x, 10); },
+                    xTicker: Dygraph.dateTicker
+                  });
+    </code>
+
+    <h4>y-values</h4>
+    <p>Dependent (y-axis) values are simpler than x-values because they're
+    always numbers. The complexity here comes from the various ways that you can
+    specify the uncertainty in your measurements.</p>
+
+    <p>If your y-values are just numbers, then they need to be parseable by
+    JavaScript's parseFloat function. Acceptable formats include:</p>
+
+    <ul>
+      <li>12
+      <li>-12
+      <li>12.
+      <li>12.3
+      <li>1.24e+1
+      <li>-1.24e+1
+    </ul>
+
+    <p>If you have missing data, just leave the column blank (your CSV file will
+    probably contain a ",," in it).</p>
+
+    <p>If your numbers have uncertainty associated with them, then there are
+    three basic ways to express this: using fractions, standard deviations or
+    explicit ranges.</p>
+
+    <h5>Fractions</h5>
+    <p>If you specify the <i>fractions</i> option, then your data will all be
+    interpreted as ratios between zero and one. This is often the case if you're
+    plotting a percentage.</p>
+
+    <code>
+      new Dygraph(el,
+                  "X,Frac1,Frac2\n" +
+                  "1,1/2,3/4\n"+
+                  "2,1/3,2/3\n"+
+                  "3,2/3,17/49\n"+
+                  "4,25/30,100/200",
+                  { fractions: true });
+    </code>
+
+    <p>Why not just divide the fractions out yourself? There are two attractive
+    reasons not to:</p>
+
+    <ul>
+      <li>If you set both <i>fractions</i> and <i>errorBars</i>, then the
+    denominator is interpreted as a sample size and dygraphs will plot <a
+      href="http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval">Wilson
+      binomial proportion confidence intervals</a> around each point.
+
+      <li>If you set <i>showRoller</i>, then dygraphs will combine the values as
+      fractions. If two points are <i>a/b</i> and <i>c/d</i>, it will plot
+      <i>(a+c) / (b+d)</i> rather than <i>(a/b + c/d) / 2</i>, which is what
+      you'd get if you divided the fractions through. This will also shrink the
+      confidence intervals.</li>
+    </ul>
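+
+    <p>To see why rolling fractions differs from averaging ratios, here is a
+    small sketch. It reads the combined fraction as total numerator over total
+    denominator; both function names are hypothetical and not part of
+    dygraphs.</p>
+

```javascript
// Hypothetical helper: combine [numerator, denominator] pairs by summing
// numerators and denominators separately, then dividing once.
function combineFractions(points) {
  let num = 0;
  let den = 0;
  for (const [n, d] of points) {
    num += n;
    den += d;
  }
  return num / den;
}

// Naive alternative: divide each fraction first, then average the ratios.
function averageOfRatios(points) {
  const ratios = points.map(([n, d]) => n / d);
  return ratios.reduce((sum, r) => sum + r, 0) / ratios.length;
}

// A small sample (1/2) next to a large one (90/100).
const rollingPeriod = [[1, 2], [90, 100]];
console.log(combineFractions(rollingPeriod).toFixed(3)); // 0.892
console.log(averageOfRatios(rollingPeriod).toFixed(3));  // 0.700
```

+    <p>The combined value weights each point by its sample size, so the
+    two-sample point barely moves the result.</p>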
+
+    <h5>Standard Deviations</h5>
+    <p>Often you have a measurement and also a measure of its uncertainty: a
+    standard deviation. If you specify the <i>errorBars</i> option, dygraphs
+    will look for alternating value and standard deviation columns in your CSV
+    data.  Here's what it should look like:</p>
+
+    <code>
+      new Dygraph(el,
+                  "X,Y1,Y2\n" +
+                  "1,10,5,20,5\n" +
+                  "2,12,5,22,5\n",
+                  { errorBars: true });
+    </code>
+
+    <p>The "5" values are standard deviations. When each point is plotted, a
+    2-standard deviation region around it is shaded, resulting in a 95%
+    confidence interval. If you want more or less confidence, you can set the
+    <i>sigma</i> option to something other than 2.0.</p>
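+
+    <p>As a sketch, the shaded region around a point can be thought of as
+    follows. <i>errorBand</i> is a hypothetical helper, and it assumes the
+    region extends <i>sigma</i> standard deviations on each side of the
+    value.</p>
+

```javascript
// Hypothetical helper: bounds of the shaded region around one point,
// assuming sigma standard deviations on each side of the value.
function errorBand(value, stdDev, sigma = 2.0) {
  return [value - sigma * stdDev, value + sigma * stdDev];
}

console.log(errorBand(10, 5));      // low 0, high 20
console.log(errorBand(10, 5, 1.0)); // low 5, high 15
```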
+
+    <p>When you roll data with standard deviations, dygraphs will plot the
+    average of your values in each rolling period and the RMS value of your
+    standard deviations:
+    sqrt(std1<sup>2</sup> + std2<sup>2</sup> + ... + stdN<sup>2</sup>)/N.</p>
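+
+    <p>A worked example may help. The sketch below assumes the combined
+    deviation is the square root of the sum of the squared standard
+    deviations, divided by N; <i>rollWithStdDev</i> is a hypothetical name
+    used only for illustration.</p>
+

```javascript
// Hypothetical helper: roll [value, stdDev] pairs over one rolling period.
// Assumes the deviations combine as sqrt(sum of squares) / N.
function rollWithStdDev(points) {
  const n = points.length;
  let sum = 0;
  let sumSquares = 0;
  for (const [value, stdDev] of points) {
    sum += value;
    sumSquares += stdDev * stdDev;
  }
  return { value: sum / n, stdDev: Math.sqrt(sumSquares) / n };
}

const rolled = rollWithStdDev([[10, 5], [12, 5]]);
console.log(rolled.value);             // 11
console.log(rolled.stdDev.toFixed(3)); // 3.536
```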
+
+    <h5>Custom error bars</h5>
+    <p>Sometimes your data has asymmetric uncertainty or you want to specify
+    something else with the error bars around a point. One example of this is
+    the "temperatures" demo on the <a href="http://danvk.org/dygraphs">dygraphs
+      home page</a>, where the point is the daily average and the bars denote
+    the low and high temperatures for the day.</p>
+
+    <p>To specify this format, set the <i>customBars</i> option. Your CSV values
+    should each be three numbers separated by semicolons ("low;mid;high").
+    Here's an example:</p>
+
+    <code>
+      new Dygraph(el,
+                  "X,Y1,Y2\n" +
+                  "1,10;20;30,20;5;25\n" +
+                  "2,10;25;35,20;10;25\n",
+                  { customBars: true });
+    </code>
+
+    <p>The middle value need not lie between the low and high values. If you set
+    a rolling period, the three values will all be averaged independently.</p>
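+
+    <p>Independent averaging of the three values can be sketched like this.
+    Both helper names are hypothetical; the cells use the "low;mid;high"
+    format from the example above.</p>
+

```javascript
// Hypothetical helper: split a "low;mid;high" cell into three numbers.
function parseCustomBar(cell) {
  return cell.split(";").map((part) => parseFloat(part));
}

// Hypothetical helper: average low, mid and high independently
// across all cells in one rolling period.
function rollCustomBars(cells) {
  const totals = [0, 0, 0];
  for (const cell of cells) {
    parseCustomBar(cell).forEach((v, i) => { totals[i] += v; });
  }
  return totals.map((t) => t / cells.length);
}

// Y1 column from the example: low 10, mid 22.5, high 32.5
console.log(rollCustomBars(["10;20;30", "10;25;35"]));
```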
+
+
+    <h3>URL</h3>
+    <p>If you pass in a URL, dygraphs will issue an XMLHttpRequest for it and
+    attempt to parse the returned data as CSV.
+    </p>
+
+    <p><i>Common problems</i>. Make sure the URL is accessible and returns data
+    in text format (as opposed to a CSV file with an HTML header). You can see
+    what the response looks like by checking your JS console or by requesting
+    the URL yourself.</p>
+
+
+    <h3>Array (native format)</h3>
+    <p>If you'll be constructing your data set from a server-side program (or
+    from JavaScript) then you're better off producing an array than CSV data.
+    This saves the cost of parsing the CSV data and also avoids common parser
+    errors.</p>
+
+    <p>The downside is that it's harder to look at your data (you'll need to use
+    a JS debugger) and that the data format is a bit less clear for values with
+    uncertainties.</p>
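+
+    <p>As a minimal sketch, building the array from JavaScript might look like
+    the code below. It assumes the native format is an array of rows, each
+    holding the x-value (a Date or a number) followed by one number per
+    series. The <i>records</i> data and its field names are invented for
+    illustration.</p>
+

```javascript
// Hypothetical records, e.g. as produced by a server-side program.
const records = [
  { day: "2009/07/12", series1: 100, series2: 200 },
  { day: "2009/07/19", series1: 150, series2: 201 },
];

// One row per point: [x, y1, y2], with a Date object as the x-value.
const rows = records.map((r) => [new Date(r.day), r.series1, r.series2]);
const labels = ["Date", "Series1", "Series2"];

// In a page with dygraphs loaded, these would be passed as:
//   new Dygraph(el, rows, { labels: labels });
console.log(rows.length, rows[0].length); // 2 3
```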
+
+
+    <p>Array</p>
+    <ul>
+      <li>disclaimers
+      <li>Dates on the x-axis
+      <li>how to specify fractions
+      <li>how to specify missing data
+      <li>how to specify value + std. dev.
+      <li>how to specify [low, middle, high]
+    </ul>
+
+    <p>Functions</p>
+    <ul>
+      <li>make sure they work as expected:
+        <i>function() { return x; }</i>
+      is identical as a source to "x".
+    </ul>
+
+    <p>DataTable</p>
+    <ul>
+      <li>Links to relevant gviz docs
+      <li>When to use Dygraph.GvizWrapper
+      <li>how to specify fractions
+      <li>how to specify missing data
+      <li>how to specify value + std. dev.
+      <li>how to specify [low, middle, high]
+      <li>walkthrough of embedding a gadget in google docs/on a web page
+      <li>walkthrough of using std. dev. in a spreadsheet chart
+    </ul>
+
+  </body>
+</html>