diff --git a/DESIGN.md b/DESIGN.md
index 89c4c568..007513de 100644
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -1,5 +1,73 @@
 # Design Notes
 
+## Roadmap
+
+General comment: let's do "README" (docs) driven development here.
+
+* [x] [show] Local functionality for Frictionless datasets with CSV #528
+  * [x] Move in new work (portal-experiment) into portal.js and refactor https://github.com/datopian/portal.js.bak/issues/59
+* [ ] [show] Uber Epic covering all functionality **See below**
+  * [ ] [show] README only + data datasets (don’t have to be frictionless)
+    * (?) Graphs direct in README with say visdown …
+  * [ ] [show] SQL interface to the data (alasql or sql.js … https://github.com/agershun/alasql/wiki/Performance-Tests)
+  * [ ] file/resource subpages ... (for datasets with lots of resources)
+* [ ] Docs **80% analysed** #
+* [ ] Create portal components and library i.e. have a Table, Graph, Dataset component
+  * [ ] publish to @datopian/portal
+  * [ ] Examples
+* [ ] Catalog functionality **20% analysed**
+
+## [uber][epic] Show functionality for single datasets
+
+### Features
+
+* Elegant
+* Description (README/Description)
+* Data preview and exploration (for tabular)
+  * Basic: some sample data shown
+  * Data exploration v1: filterable
+  * Data Exploration v2: can do sql etc ...
+* Graphs / visualization
+* Validation: this row does not match schema in column X
+* Summarization e.g. this column has this range of values, this average value, this number of nulls
+
+### Dataset structure support (in rough order of priority / like implementation)
+
+* Frictionless
+* Plain README (with frontmatter)
+* README (no frontmatter) and LICENSE file (?)
+
+Data has roughly two dimensions that are relevant
+
+* Format
+  * CSV
+  * xlsx
+  * JSON
+  * ...
+* Size
+  * Small: < 5mb (can just load inline ...)
+  * Medium < 100mb
+  * Large < 5Gb
+  * xlarge > 5Gb
+
+* TODO: How does show/build work with remote files e.g. a resource ...
+
+  ```
+  path: abc.csv
+  remote_storage_url: s3://.../.../.../
+  ```
+
+  Options:
+
+  * We clone the data into path locally ...
+    * Possible problem if data is big ...
+  * Load data direct from remote_storage_url (as long as supports CORS)
+
+
+
+
+## Architecture
+
 Portal.js is a React and NextJS based framework for building dataset/resources pages and catalogs. It consists of:
 
 * React components for data portal functionality e.g. data tables, graphs, dataset pages etc
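
Note on the remote-files TODO in the hunk above: one way to combine the size tiers with the two listed options (clone locally vs. load direct from `remote_storage_url`) is sketched below in TypeScript. This is only an illustration under stated assumptions, not something the diff specifies: `Resource`, `LoadPlan`, `sizeTier`, `planLoad` and the optional `bytes` field are hypothetical names, "mb"/"Gb" are read as decimal units, exactly 5 Gb is treated as xlarge, and the rule "clone small and medium data, load the rest direct" is a guess at a policy the notes leave open. CORS support on the remote store is not checked here.

```typescript
// Hypothetical sketch: none of these names come from portal.js itself.
type SizeTier = "small" | "medium" | "large" | "xlarge";

interface Resource {
  path: string;
  remote_storage_url?: string;
  bytes?: number; // assumed to be known up front; the notes do not say how
}

interface LoadPlan {
  from: "path" | "clone" | "remote";
  location: string;
  tier: SizeTier | undefined;
}

// Assumption: "mb" and "Gb" in the notes mean decimal megabytes and gigabytes.
const MB = 1000 * 1000;
const GB = 1000 * MB;

// Map a byte count onto the tiers listed under "Size".
// The notes leave exactly 5 Gb unassigned; this sketch treats it as xlarge.
function sizeTier(bytes: number): SizeTier {
  if (bytes < 5 * MB) return "small";
  if (bytes < 100 * MB) return "medium";
  if (bytes < 5 * GB) return "large";
  return "xlarge";
}

// Decide where show/build would read a resource from.
function planLoad(resource: Resource): LoadPlan {
  const tier =
    resource.bytes === undefined ? undefined : sizeTier(resource.bytes);

  // No remote URL: the data is expected to already live at `path`.
  if (resource.remote_storage_url === undefined) {
    return { from: "path", location: resource.path, tier };
  }

  // Assumed policy: cloning is only attractive for small and medium data
  // ("possible problem if data is big").
  if (tier === "small" || tier === "medium") {
    return { from: "clone", location: resource.remote_storage_url, tier };
  }

  // Large, xlarge, or unknown size: read direct from the remote URL.
  return { from: "remote", location: resource.remote_storage_url, tier };
}
```

Keeping the decision in one pure function would let the thresholds and the clone-versus-remote rule change without touching the components that render the data.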