datahub/examples/simple-example
Luccas Mateus 5305cc4c2f
Make examples easy to use (#798)
* [monorepo][m] - remove nx from simple-example

* [simple-example][sm] - install octokit and simplify README

* [simple-example][m] - fix linting

* [monorepo][m] - simplify examples

* [monorepo][sm] - update docs
2023-04-25 07:39:34 -03:00

This repo is a simple example of a data catalog that gets its data from a series of GitHub repos. You can set up a project just like this one as follows:

  • Create a new project with create-next-app:
npx create-next-app <app-name> --example https://github.com/datopian/portaljs/tree/main/examples/simple-example
cd <app-name>
  • This project uses the GitHub API, which caps anonymous users at 60 requests per hour, so you may want to create a Personal Access Token and add it to a .env file inside the project folder like so:
GITHUB_PAT=<github token>
  • Edit the file datasets.json to your liking. Some examples can be found inside this repo.
  • Run the app using:
npm run dev

Congratulations, you now have the data catalog running on http://localhost:3000. Clicking More info on any dataset takes you to that dataset's own page.
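The token step above can be sketched as a small helper that only sends credentials when GITHUB_PAT is set. This is a minimal illustration, not the code this example actually ships: the function name githubHeaders is hypothetical, and the app itself may pass the token to a client library such as Octokit instead of building headers by hand.

```typescript
// Hypothetical helper (not part of this repo): builds request headers for the
// GitHub REST API, adding credentials only when a token is available.
function githubHeaders(token?: string): Record<string, string> {
  const headers: Record<string, string> = {
    Accept: "application/vnd.github+json",
  };
  // Without a token the request is anonymous and falls under the lower
  // rate limit; with one, GitHub applies the authenticated limit instead.
  if (token && token.trim() !== "") {
    headers.Authorization = `Bearer ${token.trim()}`;
  }
  return headers;
}
```

In a Next.js server-side function you could call it as githubHeaders(process.env.GITHUB_PAT) and pass the result as the headers option of fetch. Keeping the token optional means the app still works without a .env file, just with fewer requests per hour.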

Deployment

Deploy with Vercel

Clicking this button redirects you to a page where you can clone the content into your own GitHub, GitLab, or Bitbucket account and deploy everything automatically.

Structure of datasets.json

The datasets.json file is simply a list of datasets. Below is a minimal example of a dataset:

{
  "owner": "fivethirtyeight",
  "repo": "data",
  "branch": "master",
  "files": ["nba-raptor/historical_RAPTOR_by_player.csv", "nba-raptor/historical_RAPTOR_by_team.csv"],
  "readme": "nba-raptor/README.md"
}

It has

  • An owner, which is the GitHub repo owner
  • A repo, which is the GitHub repo name
  • A branch, which is the branch from which the files and the readme are fetched
  • A list of files, which is a list of paths to the files that you want to show to the world
  • A readme, which is the path to your data description; it can also be a subpath, e.g. example/README.md
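The required fields above can be written down as a type with a runtime check, which helps catch a malformed entry before the app tries to fetch it. This is a sketch under assumptions: the names DatasetEntry, isDatasetEntry, and rawFileUrl are hypothetical, and building a raw.githubusercontent.com URL is only one possible way to reach a file; the app may go through the GitHub API instead.

```typescript
// Hypothetical type mirroring the five required fields listed above.
interface DatasetEntry {
  owner: string;
  repo: string;
  branch: string;
  files: string[];
  readme: string;
}

// Runtime check that a parsed JSON value has every required field
// with the expected type.
function isDatasetEntry(value: unknown): value is DatasetEntry {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.owner === "string" &&
    typeof v.repo === "string" &&
    typeof v.branch === "string" &&
    typeof v.readme === "string" &&
    Array.isArray(v.files) &&
    v.files.every((f) => typeof f === "string")
  );
}

// Assumption: files are read from raw.githubusercontent.com. The owner,
// repo, and branch select the repository snapshot; path selects the file.
function rawFileUrl(entry: DatasetEntry, path: string): string {
  return `https://raw.githubusercontent.com/${entry.owner}/${entry.repo}/${entry.branch}/${path}`;
}
```

For the example entry above, rawFileUrl(entry, entry.readme) points at nba-raptor/README.md on the master branch of fivethirtyeight/data.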

You can also add

  • A description, which is useful if you have more than one dataset per repo; if it is not provided, the repo description is used
  • A Name, which is useful if you want to give your dataset a nice name; if it is not provided, the name is built by joining the owner, the repo, and the path of the README, which in the example above gives fivethirtyeight/data/nba-raptor
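The name fallback described above can be illustrated with a short function. This is a sketch, not the app's actual implementation: displayName and NamedDataset are hypothetical names, the lowercase key spelling name is an assumption, and "the path of the README" is interpreted as the README's folder because that reproduces the example result.

```typescript
// Hypothetical shape: only the fields needed to work out a display name.
interface NamedDataset {
  owner: string;
  repo: string;
  readme: string;
  name?: string; // assumed key spelling; check it against your datasets.json
}

// Returns the explicit name when present, otherwise owner/repo/<README folder>.
function displayName(dataset: NamedDataset): string {
  if (dataset.name && dataset.name.trim() !== "") return dataset.name;
  // Drop the README file name and keep only its folder segments
  // (none at all when the README sits at the repo root).
  const folder = dataset.readme.split("/").slice(0, -1);
  return [dataset.owner, dataset.repo, ...folder].join("/");
}

// Prints "fivethirtyeight/data/nba-raptor"
console.log(
  displayName({ owner: "fivethirtyeight", repo: "data", readme: "nba-raptor/README.md" })
);
```

Giving two datasets from the same repo distinct README folders, or explicit names, keeps their catalog entries distinguishable.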

Extra commands

You can also build the project for production with

npm run build

Then run the production build with:

npm run start