This repository has been archived by the owner on Nov 6, 2023. It is now read-only.

Fix several small bugs and introduce a local demo environment
DementevNikita committed Jul 3, 2022
1 parent c5e3fd2 commit 60dd6fa
Showing 5 changed files with 259 additions and 482 deletions.
64 changes: 64 additions & 0 deletions docker/README.md
@@ -0,0 +1,64 @@
# Open Data Discovery GCP Collector local demo environment
* * *

The following instructions describe how to run the ODD GCP Collector locally using Docker and docker-compose.

This environment consists of:
* ODD Platform – an application that ingests, structures, and indexes collected metadata and exposes it via a REST API and UI
* ODD Collector GCP – a lightweight service that gathers metadata from GCP

## Prerequisites

* Docker Engine 19.03.0+
* Preferably the latest docker-compose

## Step 1: Configuring and running ODD Platform

### Assumptions

* Port 8080 is free. Commands to check that might be:
  * Linux/Mac: `lsof -i -P -n | grep LISTEN | grep 8080`
  * Windows Powershell: `Get-NetTCPConnection | where Localport -eq 8080 | select Localport,OwningProcess`
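
The same check can be done in a platform-independent way. The following is a minimal sketch, not part of the repository; the function name `port_is_free` is hypothetical, and it only tests whether something accepts TCP connections on the loopback address.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port.

    Hypothetical helper for illustration; it is not part of the collector.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection,
        # and a non-zero error code otherwise.
        return sock.connect_ex((host, port)) != 0


# Example usage before starting the platform:
#     if not port_is_free(8080):
#         raise SystemExit("Port 8080 is already in use")
```

A refused connection is treated as "free", which is sufficient for a local demo but does not detect a process bound to a different interface.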

### Execution

Run **from the project root folder** `docker-compose -f docker/demo.yaml up -d odd-platform`.

### Result

Open http://localhost:8080/ in your browser. You should see an empty catalog.
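
The platform may need some time to start after `docker-compose up` returns. As a rough sketch (the helper name `wait_for_http` and the timeout values are assumptions, not part of the project), one could poll the URL until it answers:

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it answers without a server error, or give up.

    Hypothetical helper for illustration; the defaults are arbitrary.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True  # 2xx/3xx response: the server is up
        except urllib.error.HTTPError as exc:
            # A 4xx answer still proves the server is reachable.
            if exc.code < 500:
                return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable yet; retry after a short pause
        time.sleep(interval_s)
    return False


# Example usage:
#     wait_for_http("http://localhost:8080/")
```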

## Step 2: Configuring and running GCP Collector to gather metadata

### Create Collector entity

1. Go to http://localhost:8080/management/collectors and select `Add collector`
2. Complete the following fields:
   * **Name**
   * **Namespace** (optional)
   * **Description** (optional)
3. Click **Save**. Your collector should appear in the list
4. Copy the token by clicking **Copy** to the right of the token value

### Configure and run the Collector

1. Paste the token obtained in the previous step into the `docker/config/collector_config.yaml` file under the `token` entry
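2. Create GCP credentials as described in [the authentication documentation](https://cloud.google.com/docs/authentication/getting-started) and save the key file as `docker/config/key.json`. The demo passes this file to the collector through `GOOGLE_APPLICATION_CREDENTIALS`, so it should be a JSON credentials file (for example, a service account key) rather than a plain API key string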
3. Run **from the project root folder** `docker-compose -f docker/demo.yaml up -d odd-collector-gcp`.
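
Before starting the collector, it can help to sanity-check the key file. The sketch below assumes the file is a service account key, which the README does not state explicitly; the function name `check_service_account_key` is hypothetical, and the field names are the ones normally found in service account key files.

```python
import json
from pathlib import Path
from typing import List

# Fields normally present in a service account key file (assumption:
# the demo uses this kind of credential).
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def check_service_account_key(path: str) -> List[str]:
    """Return a list of problems found; an empty list means the file looks plausible."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    missing = sorted(REQUIRED_KEYS - data.keys())
    if missing:
        problems.append(f"missing fields: {', '.join(missing)}")
    if data.get("type") != "service_account":
        problems.append("field 'type' is not 'service_account'")
    return problems


# Example usage from the project root:
#     print(check_service_account_key("docker/config/key.json"))
```

This only checks the shape of the file; it does not verify that the credentials are valid or have access to BigQuery.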

### Result

1. Open http://localhost:8080/management/datasources in your browser

You should be able to see a new data source with the name `bigquery-storage`

2. Go to the **Catalog** section. Select the created data source in the `Datasources` filter

You should be able to see the entities that the GCP Collector gathered from GCP.

### Troubleshooting

**My entities from the sample data aren't shown in the platform.**

Check the logs by running **from the project root folder** `docker-compose -f docker/demo.yaml logs -f`
6 changes: 6 additions & 0 deletions docker/config/collector_config.yaml
@@ -0,0 +1,6 @@
default_pulling_interval: 1
token: "PVXrO6ENL0Va6lYKuybgt0SoHsTQd0LoLotbZqMi"
plugins:
  - type: bigquery_storage
    name: bigquery_storage
    project: opendatadiscovery
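
The README asks the user to paste the collector token into this file by hand. As an illustrative sketch only (the helper `set_token` is hypothetical and works on the file text with a regular expression rather than a YAML parser), the replacement could be scripted like this:

```python
import re


def set_token(config_text: str, token: str) -> str:
    """Return config_text with the top-level `token:` line replaced.

    Hypothetical helper; assumes the token entry sits on a single
    unindented line, as in docker/config/collector_config.yaml.
    """
    if '"' in token or "\n" in token:
        raise ValueError("token must not contain quotes or newlines")
    # Use a function as the replacement so the token is inserted literally.
    new_text, count = re.subn(
        r"(?m)^token:.*$",
        lambda _match: f'token: "{token}"',
        config_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no top-level 'token:' entry found")
    return new_text


# Example usage from the project root:
#     from pathlib import Path
#     path = Path("docker/config/collector_config.yaml")
#     path.write_text(set_token(path.read_text(), "<token copied from the UI>"))
```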
32 changes: 32 additions & 0 deletions docker/demo.yaml
@@ -0,0 +1,32 @@
version: "3.8"

services:
  odd-platform-database:
    image: postgres:13.2-alpine
    restart: always
    environment:
      - POSTGRES_USER=odd-platform
      - POSTGRES_PASSWORD=odd-platform-password
      - POSTGRES_DB=odd-platform

  odd-platform:
    image: ghcr.io/opendatadiscovery/odd-platform:0.5.3-arm
    restart: always
    environment:
      - SPRING_DATASOURCE_URL=jdbc:postgresql://odd-platform-database:5432/odd-platform
      - SPRING_DATASOURCE_USERNAME=odd-platform
      - SPRING_DATASOURCE_PASSWORD=odd-platform-password
    depends_on:
      - odd-platform-database
    ports:
      - 8080:8080

  odd-collector-gcp:
    image: odd-collector-gcp:0.1.0
    restart: always
    volumes:
      - ./config/collector_config.yaml:/app/collector_config.yaml
      - ./config/key.json:/etc/key.json
    environment:
      - PLATFORM_HOST_URL=http://odd-platform:8080
      - GOOGLE_APPLICATION_CREDENTIALS=/etc/key.json
8 changes: 3 additions & 5 deletions odd_collector_gcp/adapters/bigquery_storage/mapper.py
@@ -80,6 +80,9 @@ def map_table(self, table: Table) -> DataEntity:
        )

    def map_schema(self, schema: SchemaField) -> List[DataSetField]:
        if isinstance(schema, list):
            return reduce(iconcat, [self.map_field(f) for f in schema], [])

        return reduce(iconcat, [self.map_field(f) for f in schema.fields], [])

    def map_field(self, field: SchemaField) -> List[DataSetField]:
@@ -111,9 +114,4 @@ def map_simple_field(
            ),
        )

        if field.type.type == Type.TYPE_STRING:
            field_schema.stats = DataSetFieldStat(
                string_stats=StringFieldStat(max_length=field_schema.max_length)
            )

        return field
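
The `map_schema` change accepts either a list of fields or a single object with a `fields` attribute, and flattens the per-field results with `reduce(iconcat, ..., [])`. The standalone sketch below illustrates that pattern only; `FakeSchemaField` and the dotted-path output are stand-ins invented for this example, not the collector's real types.

```python
from dataclasses import dataclass, field
from functools import reduce
from operator import iconcat
from typing import List, Union


@dataclass
class FakeSchemaField:
    """Hypothetical stand-in for a schema field with optional nested fields."""
    name: str
    fields: List["FakeSchemaField"] = field(default_factory=list)


def map_field(f: FakeSchemaField, prefix: str = "") -> List[str]:
    """Return the field's dotted path followed by the paths of all nested fields."""
    path = f"{prefix}{f.name}"
    nested = [map_field(child, prefix=f"{path}.") for child in f.fields]
    # iconcat extends the accumulator in place, flattening the list of lists.
    return reduce(iconcat, nested, [path])


def map_schema(schema: Union[FakeSchemaField, List[FakeSchemaField]]) -> List[str]:
    """Accept a list of fields or a single record-like field, as in the diff."""
    fields = schema if isinstance(schema, list) else schema.fields
    return reduce(iconcat, [map_field(f) for f in fields], [])


schema = [
    FakeSchemaField("id"),
    FakeSchemaField("address", [FakeSchemaField("city"), FakeSchemaField("zip")]),
]
print(map_schema(schema))
# ['id', 'address', 'address.city', 'address.zip']
```

Passing a fresh `[]` as the initial value matters here: `iconcat` mutates its first argument, so without it the first per-field list would be extended in place.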