Time series data
Pecos uses Pandas DataFrames to store and analyze data indexed by time. Pandas DataFrames store 2D data with labeled columns. Pandas includes a wide range of time series analysis and date-time functionality. By using Pandas DataFrames, Pecos can work with a wide range of timestamp string formats, including those with a UTC offset.
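As a minimal sketch of the timestamp handling mentioned above, the following uses plain Pandas (not a Pecos function) to parse timestamp strings that carry a UTC offset. The timestamp values and the column name A are made up for illustration.

```python
import pandas as pd

# Hypothetical timestamp strings with an explicit UTC offset (-07:00)
stamps = ['2018-01-01 00:00:00-07:00', '2018-01-01 01:00:00-07:00']

# utc=True converts the offset-aware strings to a UTC DatetimeIndex
index = pd.to_datetime(stamps, utc=True)

# Use the parsed timestamps as the index of a DataFrame
df = pd.DataFrame({'A': [0, 1]}, index=index)
print(df.index[0])  # 2018-01-01 07:00:00+00:00
```

Converting to UTC on load is one way to avoid ambiguity when data comes from sources in different time zones; whether that suits a given data set is a judgment call.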
Pandas includes many built-in functions to read data from CSV, Excel, SQL, etc. For example, data can be loaded from an Excel file using the following code.
>>> import pandas as pd
>>> data = pd.read_excel('data.xlsx')
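CSV data can be loaded in a similar way. The sketch below assumes the first column of the file holds the timestamps; the CSV content is invented for illustration and is read from an in-memory string so that the example is self-contained.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would be a file path
csv_text = """timestamp,A,B
2018-01-01,0,5
2018-01-02,1,6
2018-01-03,2,7
"""

# index_col=0 uses the first column as the index, and
# parse_dates=True parses that index into a DatetimeIndex
data = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
print(data)
```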
Data can also be gathered from the web using the Python package requests, http://docs.python-requests.org.
To get started, create an instance of the PerformanceMonitoring class.
Quality control tests can also be called using individual functions; see Framework for more details.
>>> import pecos
>>> pm = pecos.monitoring.PerformanceMonitoring()
Data, in the form of a Pandas DataFrame, can then be added to the PerformanceMonitoring object using the add_dataframe method.
The data is accessed using the df attribute of the PerformanceMonitoring object (pm.df).
Multiple DataFrames can be added to the PerformanceMonitoring object. New data overrides existing data if DataFrames share indexes and columns. Missing indexes and columns are filled with NaN. An example is shown below.
>>> print(data1)
            A  B
2018-01-01  0  5
2018-01-02  1  6
2018-01-03  2  7
>>> print(data2)
            B  C
2018-01-02  0  5
2018-01-03  1  6
2018-01-04  2  7
>>> pm.add_dataframe(data1)
>>> pm.add_dataframe(data2)
>>> print(pm.df)
              A    B    C
2018-01-01  0.0  5.0  NaN
2018-01-02  1.0  0.0  5.0
2018-01-03  2.0  1.0  6.0
2018-01-04  NaN  2.0  7.0
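The override-and-fill behavior in this example can be reproduced with plain Pandas, which may help when checking what a merge will produce before adding data to the PerformanceMonitoring object. The sketch below uses DataFrame.combine_first; this is an illustration of the same semantics, not necessarily how Pecos implements add_dataframe internally.

```python
import pandas as pd

# Rebuild the two example DataFrames with a daily DatetimeIndex
data1 = pd.DataFrame({'A': [0, 1, 2], 'B': [5, 6, 7]},
                     index=pd.date_range('2018-01-01', periods=3, freq='D'))
data2 = pd.DataFrame({'B': [0, 1, 2], 'C': [5, 6, 7]},
                     index=pd.date_range('2018-01-02', periods=3, freq='D'))

# combine_first keeps data2 wherever it has values and falls back to
# data1 elsewhere; the result spans the union of indexes and columns,
# with NaN where neither DataFrame has a value
combined = data2.combine_first(data1)
print(combined)
```

Calling combine_first on the newer DataFrame is what gives the new data priority; reversing the call (data1.combine_first(data2)) would keep the existing values instead.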