Add first parser
parent 1d0d1b5cc2
commit 05182a65fe
49
Design.md
Normal file
@@ -0,0 +1,49 @@
# Email Parser Design

## Purpose
A service that reads emails from an IMAP inbox, extracts data, and sends it to other services.

## Functionality
* Extract tracking numbers and send them to a package tracking service
* Extract flight numbers and confirmations and send them to a trip tracking service
* Extract dates and events and send them to a calendar service

## Secondary services

### Package Tracking Service
* Receive tracking info via API and store it in a database
* Web interface for viewing the current status of all packages
* Filters by date and status
* iCal subscription for arrival dates
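
The iCal subscription could be as simple as rendering one all-day event per package. The following is a minimal sketch using only the Ruby standard library. The `Package` struct, its fields, the `PRODID` string, and the `packages_to_ical` helper are all hypothetical names chosen for illustration; the design does not define a data model yet.

```ruby
require 'date'

# Hypothetical record; a real service would load these from its database.
Package = Struct.new(:tracking_number, :carrier, :arrival_date)

# Render packages as an iCalendar feed with one all-day event per arrival.
def packages_to_ical(packages, now: Time.now.utc)
  stamp = now.strftime('%Y%m%dT%H%M%SZ')
  lines = [
    'BEGIN:VCALENDAR',
    'VERSION:2.0',
    'PRODID:-//email-parser//package-tracking//EN' # assumed product id
  ]
  packages.each do |pkg|
    lines += [
      'BEGIN:VEVENT',
      "UID:#{pkg.tracking_number}@package-tracking",
      "DTSTAMP:#{stamp}",
      "DTSTART;VALUE=DATE:#{pkg.arrival_date.strftime('%Y%m%d')}",
      "SUMMARY:#{pkg.carrier} package #{pkg.tracking_number} arrives",
      'END:VEVENT'
    ]
  end
  lines << 'END:VCALENDAR'
  # iCalendar content lines are terminated with CRLF
  lines.join("\r\n") + "\r\n"
end
```

A web endpoint would return this string with the `text/calendar` content type so that calendar clients can subscribe to it.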

### Flight Tracking Service
* Receive tracking info via API and store it in a database
* Web interface for viewing the current status of all flights
* Filters by date and status
* iCal subscription for flight times

## Architecture 1: Micro-services
A single service reads each email and sends its content to a list of parser services.

A parser service would conform to an interface with an API that accepts an email with several attributes: sender, recipients, subject, body, datetime. It would then extract some attribute and send it to a tracking service. The attribute would be something like the tracking number for a flight or package, or a calendar time for an event.

A tracking service would accept this info and store it in its database. It would then provide a front end to this data via a website and an iCal calendar URL. It may also be possible to further abstract the schemas and interfaces so that all trackers share a common infrastructure but make unique requests to metadata services.
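
To make the parser interface more concrete, here is a minimal sketch of the email payload and the fan-out step. The `Email` struct mirrors the attributes listed above; the JSON field names, the ISO 8601 timestamp format, and the `fan_out` helper are assumptions, and each "parser" is a plain callable standing in for an HTTP request to a real parser service.

```ruby
require 'json'
require 'time'

# The attributes every parser service would receive.
Email = Struct.new(:sender, :recipients, :subject, :body, :datetime) do
  # Serialize to the JSON document that would be sent to a parser.
  def to_json(*args)
    {
      sender: sender,
      recipients: recipients,
      subject: subject,
      body: body,
      datetime: datetime.iso8601 # assumed wire format for timestamps
    }.to_json(*args)
  end
end

# Send one email to every parser in the list and collect the replies.
# Each parser is anything that responds to `call(json_string)`; in a real
# deployment this would be an HTTP POST to the parser's endpoint.
def fan_out(email, parsers)
  payload = email.to_json
  parsers.map { |parser| parser.call(payload) }
end

email = Email.new(
  'shop@example.com',
  ['me@example.com'],
  'Your order shipped',
  'Tracking: 1Z879E930346834440',
  Time.utc(2018, 1, 15, 9, 30)
)

# A stand-in parser that just echoes the subject back.
echo_subject = ->(json) { JSON.parse(json)['subject'] }
replies = fan_out(email, [echo_subject])
```

Keeping the payload identical for every parser means a new parser can be added by appending to the list, without changing the reader service.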

## Architecture 2: Micro-services

A scanner service scans emails and sends them to an indexer. The indexer receives the email contents and makes requests to the parser services. The parser services respond with the extracted text, and the indexer inserts it into the database. The indexer also exposes a RESTful API on top of the data model.

Viewing services would use the RESTful API to display content and expose additional metadata.
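
The indexer's role in this architecture can be sketched in a few lines. This is an in-memory illustration only: the `Indexer` class, its method names, and the Hash used in place of a database are hypothetical, and the fake parser's regular expression is a crude pattern for demonstration, not a validated tracking-number matcher.

```ruby
# In-memory stand-in for the indexer; a Hash replaces the real database.
class Indexer
  attr_reader :records

  # parsers: objects responding to `call(message)` and returning an array of
  # hashes shaped like {"token" => ..., "type" => ..., "metadata" => {...}}
  def initialize(parsers)
    @parsers = parsers
    @records = Hash.new { |hash, type| hash[type] = [] }
  end

  # Ask every parser for tokens and store whatever comes back, grouped by type.
  def index(message)
    @parsers.each do |parser|
      parser.call(message).each do |result|
        @records[result['type']] << result
      end
    end
    self
  end

  # Roughly what a "list by type" API endpoint might return.
  def find_by_type(type)
    @records.fetch(type, [])
  end
end

# Fake parser standing in for an HTTP call to a parser service.
fake_shipping_parser = lambda do |message|
  message.scan(/1Z[0-9A-Z]{16}/).map do |token|
    { 'token' => token, 'type' => 'SHIPPING', 'metadata' => {} }
  end
end

indexer = Indexer.new([fake_shipping_parser])
indexer.index('Your package 1Z879E930346834440 has shipped')
shipping_tokens = indexer.find_by_type('SHIPPING').map { |r| r['token'] }
```

Compared with Architecture 1, the parsers here never talk to a tracking service directly; they only return results, and storage is centralized in the indexer.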

## Architecture 3: Message based queue

A scanner scans emails and inserts a task for each one into a queuing service (RabbitMQ with a fanout exchange). Multiple parsers read from these queues, attach the extracted data, and make requests to the indexing service for storage. Front-end services sit on top of a RESTful API over the database.
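
The fanout behaviour can be illustrated without a broker. The sketch below is an in-memory stand-in, not RabbitMQ client code: `FanoutExchange` and its `bind` and `publish` methods are hypothetical names, and Ruby's built-in thread-safe `Queue` plays the role of each parser's queue.

```ruby
require 'thread'

# In-memory stand-in for a fanout exchange: every bound queue receives a
# copy of every published message.
class FanoutExchange
  def initialize
    @queues = []
  end

  # Create and return a new queue bound to this exchange.
  def bind
    queue = Queue.new
    @queues << queue
    queue
  end

  # Deliver the message to every bound queue.
  def publish(message)
    @queues.each { |queue| queue << message }
  end
end

exchange = FanoutExchange.new
package_queue = exchange.bind # consumed by a package-tracking parser
flight_queue = exchange.bind  # consumed by a flight parser

# The scanner publishes each email once.
exchange.publish('Your flight UA123 and package 1Z879E930346834440')

# Each parser pops from its own queue and sees the same email.
package_task = package_queue.pop
flight_task = flight_queue.pop
```

The scanner does not need to know how many parsers exist; adding a parser only requires binding another queue.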

## Useful packages
Golang package for extracting numbers and carriers from unstructured text: https://github.com/lensrentals/trackr

Python package for retrieving status from a tracking number: https://github.com/alertedsnake/packagetracker

Ruby gem for extracting shipping info from a number or unstructured text: https://github.com/jkeen/tracking_number

Ruby gem for retrieving tracking info based on an ID: https://github.com/travishaynes/trackerific
6
docker-compose.yml
Normal file
@@ -0,0 +1,6 @@
version: '2'
services:
  parser_package_tracking:
    build: ./parsers/package-tracking
    ports:
      - "8183:3000"
22
parsers/Readme.md
Normal file
@@ -0,0 +1,22 @@
# parsers

A parser should conform to a simple API spec so that it can be easily accessed.

# Healthcheck
Simple endpoint that accepts nothing and returns 'OK' on success.

|Field   |Value |
|--------|------|
|Path    |`/`   |
|Method  |`GET` |
|Response|`"OK"`|

# Parse
The primary endpoint that will parse a message.

|Field   |Value                      |
|--------|---------------------------|
|Path    |`/parse`                   |
|Method  |`POST`                     |
|Request |`{"message": "Email body"}`|
|Response|JSON array of results      |

Each element of the response array has the following keys:

|Key     |Example Value         |Description|
|--------|----------------------|-----------|
|token   |`"1Z879E930346834440"`|String token that was extracted|
|type    |`"SHIPPING"`          |A string that indicates what type of data was extracted. Other services use this to understand what kind of data the token is.|
|metadata|`{"carrier": "UPS"}`  |A dictionary with any additional metadata that may be used by other services|
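
A client for this spec might look like the sketch below. It only builds the request and validates a response body, so it runs without a parser being up. The helper names `build_parse_request` and `parse_results` are hypothetical, the `Content-Type` header is an assumption (the spec does not state one), and `http://localhost:8183` is the port mapping from this repository's `docker-compose.yml`.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build (but do not send) the POST request for a parser's /parse endpoint.
def build_parse_request(base_url, message)
  uri = URI.join(base_url, '/parse')
  request = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json')
  request.body = JSON.dump('message' => message)
  request
end

# Check that a response body has the keys from the table above and
# return the parsed results.
def parse_results(json_body)
  results = JSON.parse(json_body)
  results.each do |result|
    missing = %w[token type metadata] - result.keys
    raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?
  end
  results
end

request = build_parse_request('http://localhost:8183', 'Tracking: 1Z879E930346834440')

# Example response body, shaped like the table above.
sample = '[{"token": "1Z879E930346834440", "type": "SHIPPING", "metadata": {"carrier": "UPS"}}]'
results = parse_results(sample)
```

To actually send the request, pass it to `Net::HTTP.start(host, port) { |http| http.request(request) }` and feed the response body to `parse_results`.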
14
parsers/package-tracking/Dockerfile
Normal file
@@ -0,0 +1,14 @@
FROM ruby:2.5.0

# TODO: Move to Gemfile
RUN gem install sinatra -v 2.0
RUN gem install tracking_number -v 0.10.3

EXPOSE 3000

RUN mkdir -p /src
WORKDIR /src

COPY main.rb /src/

CMD ruby main.rb
6
parsers/package-tracking/docker-compose.yml
Normal file
@@ -0,0 +1,6 @@
version: '2'
services:
  main:
    build: .
    ports:
      - "127.0.0.1:8183:3000"
26
parsers/package-tracking/main.rb
Normal file
@@ -0,0 +1,26 @@
require 'sinatra'
require 'json'
require 'tracking_number'

set :bind, "0.0.0.0"
set :port, 3000

# Simple status endpoint on root
get '/' do
  'OK'
end

# Standard parser API: receives POST {"message": "Email body"} on /parse
# Returns [{"token": "extracted token", "type": "token type", "metadata": {}}]
post '/parse' do
  content_type :json
  body = JSON.parse(request.body.read)
  trackers = TrackingNumber.search(body["message"])
  # Build one result per tracking number found in the message
  results = trackers.map do |tracker|
    {
      :token => tracker.tracking_number,
      :type => "SHIPPING",
      :metadata => {}
    }
  end
  JSON.dump(results)
end