[DESIGN][m]: add roadmap section.

Rufus Pollock 2021-04-12 19:04:59 +02:00
parent 2562781e3e
commit d825e982ec


@ -1,5 +1,73 @@
# Design Notes
## Roadmap
General comment: let's do "README" (docs) driven development here.
* [x] [show] Local functionality for Frictionless datasets with CSV #528
* [x] Move new work (portal-experiment) into portal.js and refactor https://github.com/datopian/portal.js.bak/issues/59
* [ ] [show] Uber Epic covering all functionality **See below**
* [ ] [show] README only + data datasets (don't have to be Frictionless)
* (?) Graphs directly in the README with, say, visdown …
* [ ] [show] SQL interface to the data (alasql or sql.js … https://github.com/agershun/alasql/wiki/Performance-Tests)
* [ ] file/resource subpages ... (for datasets with lots of resources)
* [ ] Docs **80% analysed** #
* [ ] Create portal components and library i.e. have a Table, Graph, Dataset component
* [ ] publish to @datopian/portal
* [ ] Examples
* [ ] Catalog functionality **20% analysed**
## [uber][epic] Show functionality for single datasets
### Features
* Elegant
* Description (README/Description)
* Data preview and exploration (for tabular data)
  * Basic: some sample data shown
  * Data exploration v1: filterable
  * Data exploration v2: can do SQL etc ...
* Graphs / visualization
* Validation: this row does not match schema in column X
* Summarization e.g. this column has this range of values, this average value, this number of nulls
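
To make the summarization feature concrete, here is a minimal sketch of a per-column summary (range, average, null count). The `ColumnSummary` shape and the `summarizeColumn` name are hypothetical and not part of portal.js; the sketch also assumes that empty or whitespace-only cells count as nulls and that non-numeric strings are simply left out of the numeric statistics.

```typescript
// Hypothetical summary shape for one column; not an existing portal.js type.
interface ColumnSummary {
  count: number;       // total number of cells in the column
  nulls: number;       // cells that are null, undefined, or blank
  min: number | null;  // smallest numeric value, or null if there are none
  max: number | null;  // largest numeric value, or null if there are none
  mean: number | null; // average of the numeric values, or null if there are none
}

type Cell = string | number | null | undefined;

function summarizeColumn(values: Cell[]): ColumnSummary {
  let nulls = 0;
  const nums: number[] = [];

  for (const v of values) {
    // Assumption: blank strings (as typically found in CSV) are treated as nulls.
    if (v === null || v === undefined || (typeof v === "string" && v.trim() === "")) {
      nulls += 1;
      continue;
    }
    const n = typeof v === "number" ? v : Number(v);
    // Non-numeric strings such as "abc" are neither nulls nor numbers: skip them.
    if (Number.isFinite(n)) nums.push(n);
  }

  const sum = nums.reduce((a, b) => a + b, 0);
  return {
    count: values.length,
    nulls,
    min: nums.length > 0 ? Math.min(...nums) : null,
    max: nums.length > 0 ? Math.max(...nums) : null,
    mean: nums.length > 0 ? sum / nums.length : null,
  };
}
```

Something like `summarizeColumn(rows.map(r => r.price))` could then feed a "range / average / nulls" line in the dataset page; how that is rendered is left open here.
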
### Dataset structure support (in rough order of priority / likely implementation order)
* Frictionless
* Plain README (with frontmatter)
* README (no frontmatter) and LICENSE file (?)
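
The priority order above could be sketched as a small detection function. This is only an illustration: `detectStructure` and the `DatasetStructure` labels are hypothetical names, the input is assumed to be a map from file name to file contents, and the README is assumed to be called `README.md`. It relies on two conventions: Frictionless datasets carry a `datapackage.json` descriptor, and frontmatter opens with a `---` line. How the LICENSE file should factor in is left open, as in the list above.

```typescript
// Hypothetical labels for the three structures listed above, plus a fallback.
type DatasetStructure =
  | "frictionless"
  | "readme-frontmatter"
  | "readme-plain"
  | "unknown";

// `files` maps a file name to its text contents (an assumption for this sketch).
function detectStructure(files: Record<string, string>): DatasetStructure {
  // 1. Frictionless: a datapackage.json descriptor is present.
  if ("datapackage.json" in files) return "frictionless";

  // 2./3. README-based datasets: check whether the file opens with frontmatter.
  if (!("README.md" in files)) return "unknown";
  const readme = files["README.md"];
  return /^---\r?\n/.test(readme) ? "readme-frontmatter" : "readme-plain";
}
```
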
Data has roughly two dimensions that are relevant:
* Format
  * CSV
  * xlsx
  * JSON
  * ...
* Size
  * Small: < 5 MB (can just load inline ...)
  * Medium: < 100 MB
  * Large: < 5 GB
  * xlarge: > 5 GB
* TODO: How does show/build work with remote files e.g. a resource ...
```
path: abc.csv
remote_storage_url: s3://.../.../.../
```
Options:
* We clone the data into path locally ...
  * Possible problem if data is big ...
* Load data directly from remote_storage_url (as long as it supports CORS)
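
One way to combine the size tiers with the two options above is sketched below. None of this is decided in these notes: `sizeTier`, `planLoad`, the `bytes` field, and the policy itself (clone small and medium files, read larger or unknown-size files remotely) are assumptions for illustration, as is treating 1 MB as 1024 × 1024 bytes and counting exactly 5 GB as xlarge.

```typescript
const MB = 1024 * 1024; // assumption: binary megabytes
const GB = 1024 * MB;

type SizeTier = "small" | "medium" | "large" | "xlarge";

// Map a byte count onto the size tiers from the notes.
function sizeTier(bytes: number): SizeTier {
  if (bytes < 5 * MB) return "small";
  if (bytes < 100 * MB) return "medium";
  if (bytes < 5 * GB) return "large";
  return "xlarge"; // the notes leave exactly 5 GB unspecified; treated as xlarge here
}

// Mirrors the resource snippet above; `bytes` is a hypothetical extra field.
interface Resource {
  path: string;
  remote_storage_url?: string;
  bytes?: number;
}

type LoadPlan =
  | { kind: "local"; path: string }              // file already lives at path
  | { kind: "clone"; from: string; to: string }  // option 1: copy remote data into path
  | { kind: "remote"; url: string };             // option 2: read directly (needs CORS)

// Hypothetical policy: clone only when the file is known to be small or medium.
function planLoad(r: Resource): LoadPlan {
  if (!r.remote_storage_url) return { kind: "local", path: r.path };

  const tier = r.bytes === undefined ? undefined : sizeTier(r.bytes);
  if (tier === "small" || tier === "medium") {
    return { kind: "clone", from: r.remote_storage_url, to: r.path };
  }
  // Large, xlarge, or unknown size: avoid cloning and read from the remote URL.
  return { kind: "remote", url: r.remote_storage_url };
}
```

The cut-off between cloning and remote loading is the main design choice here; it could just as well sit at the small tier only, which would match the "can just load inline" remark.
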
## Architecture
Portal.js is a React- and Next.js-based framework for building dataset/resource pages and catalogs. It consists of:
* React components for data portal functionality e.g. data tables, graphs, dataset pages etc