Use SQLa to create the Data Vault infrastructure as I’ve shown below – we will build views in SQL to get access to the email JSON data
Create a Node.js (or other EE SDK-based) app that reads all emails via IMAP and stores the content in the Data Vault.
Set up webhooks in EE so that incoming emails are pushed to the Data Vault.
Setting up the Data Vault tables:
Hub: This will store unique email identifiers.
Satellite: This will store descriptive data related to each email.
CREATE TABLE hub_email (
email_hash BYTEA PRIMARY KEY, -- A hashed value to uniquely identify each email
load_datetime TIMESTAMP NOT NULL, -- Timestamp when the record was loaded
record_source TEXT NOT NULL -- Source of the record
);
CREATE TABLE sat_email_detail(
email_hash BYTEA NOT NULL, -- The hash to link back to the hub
load_datetime TIMESTAMP NOT NULL, -- Timestamp when the record was loaded
end_datetime TIMESTAMP, -- Timestamp when the record is superseded by a new version. Null for the current version
email_data JSONB NOT NULL, -- Email data in JSONB format (EE generates a JSON of the email)
record_source TEXT NOT NULL, -- Source of the record
PRIMARY KEY (email_hash, load_datetime),
FOREIGN KEY (email_hash) REFERENCES hub_email(email_hash)
);
CREATE TABLE sat_email_attachment(…);
-- Index for the hub (note: the PRIMARY KEY on email_hash already creates a unique index, so this one is redundant)
CREATE INDEX idx_hub_email_hash ON hub_email(email_hash);
-- Indexes for the satellite
CREATE INDEX idx_sat_email_detail_hash ON sat_email_detail(email_hash);
CREATE INDEX idx_sat_email_detail_load_datetime ON sat_email_detail(load_datetime);
CREATE INDEX idx_sat_email_detail_data_gin ON sat_email_detail USING GIN(email_data);
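As a rough illustration of what the GIN index is for, here is a containment query of the kind it can serve. The `subject` key and its value are assumptions made for the example; the real key names depend on the JSON that EE produces.

```sql
-- Sketch only: find satellite rows whose JSON contains a given key/value pair.
-- The @> (containment) operator is one of the operators a default GIN index
-- on a JSONB column supports.
-- 'subject' is an ASSUMED key; replace it with a key that exists in the EE JSON.
SELECT email_hash, load_datetime
FROM sat_email_detail
WHERE email_data @> '{"subject": "Appointment reminder"}'::jsonb;
```

Filters written with `->>` and `=` would generally not use this index, so containment is the form to prefer when searching inside `email_data`.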
Create as many views as we need for accessing the emails in properly structured format.
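A minimal sketch of one such view, assuming the satellite is named `sat_email_detail` as in the index definitions. The view name `v_email_current` and the JSON keys (`subject`, `messageId`, `from.address`) are hypothetical placeholders, since the structure of the EE JSON is not shown here.

```sql
-- Sketch only: expose the current version of each email as flat columns.
CREATE OR REPLACE VIEW v_email_current AS
SELECT
    h.email_hash,
    s.load_datetime,
    s.record_source,
    s.email_data ->> 'subject'           AS subject,       -- assumed key
    s.email_data ->> 'messageId'         AS message_id,    -- assumed key
    s.email_data -> 'from' ->> 'address' AS from_address   -- assumed nested key
FROM hub_email AS h
JOIN sat_email_detail AS s
  ON s.email_hash = h.email_hash
WHERE s.end_datetime IS NULL;  -- NULL end_datetime marks the current version
```

Missing keys simply yield NULL columns, so the view stays valid even if some emails lack a field.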
Talk with Raphael and team for how they do FHIR mapping for complex JSON from unstructured text.
Example for how to insert data – this should be a stored procedure
Assuming we have variables:
$1 as the email JSONB, $2 as the record source, and $3 as the current timestamp
-- First, compute the email_hash
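-- A DO block cannot accept parameters, so this is wrapped in a procedure; the name load_email is a placeholder.
-- digest() requires the pgcrypto extension (CREATE EXTENSION IF NOT EXISTS pgcrypto;).
CREATE OR REPLACE PROCEDURE load_email(JSONB, TEXT, TIMESTAMP)
LANGUAGE plpgsql AS $$
DECLARE email_hash_val BYTEA;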
BEGIN
email_hash_val := digest($1::TEXT, 'sha256');
-- Try to insert into hub
INSERT INTO hub_email(email_hash, load_datetime, record_source)
VALUES (email_hash_val, $3, $2)
ON CONFLICT (email_hash) DO NOTHING; -- prevent duplicates
-- Insert into satellite
INSERT INTO sat_email_detail(email_hash, load_datetime, email_data, record_source)
VALUES (email_hash_val, $3, $1, $2);
END$$;
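The insert above never sets `end_datetime`, so superseded satellite rows would stay open. One possible way to maintain it, sketched here as a separate statement run after loading, is to close every row that has a later row for the same `email_hash`. This is an assumption about how end-dating could be handled, not part of the original design.

```sql
-- Sketch only: set end_datetime on rows that have been superseded.
UPDATE sat_email_detail AS s
SET end_datetime = nxt.next_load
FROM (
    SELECT email_hash,
           load_datetime,
           -- load time of the next version of the same email, if any
           LEAD(load_datetime) OVER (
               PARTITION BY email_hash
               ORDER BY load_datetime
           ) AS next_load
    FROM sat_email_detail
) AS nxt
WHERE s.email_hash = nxt.email_hash
  AND s.load_datetime = nxt.load_datetime
  AND s.end_datetime IS NULL          -- only rows still marked current
  AND nxt.next_load IS NOT NULL;      -- only rows that have a successor
```

Because `email_hash` is computed from the full JSON, rows sharing a hash carry identical content, so this mainly matters when the same email is loaded more than once.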
Here we are using table_name_id as the primary key, along with the other required columns and the housekeeping columns. Per the reference given above, the hub table would need a primary key with a different type and a different name (email_hash BYTEA PRIMARY KEY). Can you confirm?