This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] Human PanGenomics Project Public Dataset [2/3] #1618

Closed
1 of 2 tasks
Gitel-chu opened this issue Feb 16, 2023 · 49 comments

Comments

@Gitel-chu

Data Owner Name

Human Pangenome Reference Consortium

Data Owner Country/Region

United States

Data Owner Industry

Life Science / Healthcare

Website

https://humanpangenome.org/

Social Media

https://twitter.com/HumanPangenome
https://github.com/human-pangenomics/hpgp-data

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

500TiB

On-chain address for first allocation

f1edhwfgifichqhz74dri2uh2l2exl6jtpeuuersi

Custom multisig

  • Use Custom Multisig

Identifier

No response

Share a brief history of your project and organization

We want to store this data to preserve information that is meaningful to humanity. The Human Pangenome Reference Consortium (HPRC) is a project funded by the National Human Genome Research Institute to sequence and assemble genomes from individuals from diverse populations, in order to better represent the genomic landscape of diverse human populations.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

This dataset includes sequencing data, assemblies, and analyses for the offspring of ten parent-offspring trios.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

lotus, I don't know yet

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/hpgp-data/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

  • I confirm

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

1.5 to 2 years

In which geographies do you plan on making storage deals

Asia other than Greater China, North America, South America, Europe

How will you be distributing your data to storage providers

HTTP or FTP server, Shipping hard drives, I don't know yet

How do you plan to choose storage providers

Slack, Big data exchange

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

No response

How do you plan to make deals to your storage providers

Boost client, Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

@large-datacap-requests

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@lvschouwen

Similar sets:
#1233 #1234 #1235
#1543 #1544 #1545 #1546

lucas@toolbox:~$ aws s3 ls --no-sign-request s3://human-pangenomics/ --recursive --human-readable --summarize
Total Objects: 195841
   Total Size: 1.5 PiB
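
For readers comparing the figures in this thread, the following is a minimal sketch of the arithmetic, not part of the original comment. It assumes the 1.5 PiB bucket total above is the relevant dataset size (the listing covers the whole `s3://human-pangenomics/` bucket and is rounded by `--human-readable`) and uses the 5PiB total and 500TiB weekly figures from the application form. The function name `implied_replicas` is hypothetical.

```python
# Binary (IEC) units: 1 PiB = 1024 TiB.
TIB = 1024 ** 4
PIB = 1024 ** 5


def implied_replicas(requested_bytes: float, dataset_bytes: float) -> float:
    """Number of full dataset copies the requested DataCap could cover."""
    return requested_bytes / dataset_bytes


requested = 5 * PIB    # "Total amount of DataCap being requested"
dataset = 1.5 * PIB    # "Total Size" from the aws s3 ls summary (assumed dataset size)
weekly = 500 * TIB     # "Weekly allocation of DataCap requested"

replicas = implied_replicas(requested, dataset)
weeks_to_onboard = requested / weekly

print(f"implied full copies: {replicas:.2f}")
print(f"weeks at the requested weekly rate: {weeks_to_onboard:.2f}")
```

Under these assumptions the request corresponds to roughly three full copies of the bucket. Because this application is labelled [2/3], the actual share of the bucket it covers may differ.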

@cryptowhizzard

Thank you for applying for DataCap. As a Filecoin Fil+ notary, I am screening your application and conducting due diligence.

Looking at your application, I have some questions:
As you are brand new on GitHub and have no history of past applications, applying for 5PB of DataCap seems like a lot to me. It requires comprehensive knowledge of Filecoin, data packing, data distribution, and all the requirements that come with them. Are you brand new to the Filecoin space, or have you applied for DataCap in the past under different GitHub account names?

Can you show us visible proof of the size of your data and the storage systems you have there?

As a last question, I would like you to fill out this form to provide us with the information we need to make an educated decision on whether to support your LDN request.

Thanks!

@Sunnyiscoming Sunnyiscoming self-assigned this Feb 20, 2023
@Sunnyiscoming
Collaborator

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

500TiB

Client address

f1edhwfgifichqhz74dri2uh2l2exl6jtpeuuersi

@large-datacap-requests

large-datacap-requests bot commented Feb 20, 2023

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1edhwfgifichqhz74dri2uh2l2exl6jtpeuuersi

DataCap allocation requested

250TiB

Id

2361116f-c8ab-426f-8751-20c5e2f937ad

@herrehesse

Dear Filecoin+ Github applicant,

We have noticed that the dataset is already (partly) on chain. While we appreciate your enthusiasm to contribute to the Filecoin network, we want to remind you that this behaviour may not be beneficial to the network. Can you explain to me what happened here?

Thank you for your understanding and cooperation.

Screenshot 2023-02-22 at 11 27 40

@Gitel-chu
Author

That is simply because the dataset we chose is very significant. I can only assume that this is a coincidence.


Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceac7ljn6l76ydzdbswupko23lpbixb36mdozviv7ztbhbqjo2fbig

Address

f1edhwfgifichqhz74dri2uh2l2exl6jtpeuuersi

Datacap Allocated

250.00TiB

Signer Address

f1d4yb3wags3mtddzesxoo63jv7dmlec3bq4yteni

Id

2361116f-c8ab-426f-8751-20c5e2f937ad

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceac7ljn6l76ydzdbswupko23lpbixb36mdozviv7ztbhbqjo2fbig


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacea4l7fs3ck4v4au2l7ebj7pzhsmvkxoytan5junaw2xsm4gqtq4au

Address

f1edhwfgifichqhz74dri2uh2l2exl6jtpeuuersi

Datacap Allocated

250.00TiB

Signer Address

f1bwugfihrmn3iyunzyxst5nttql3dge4khwmurtq

Id

2361116f-c8ab-426f-8751-20c5e2f937ad

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacea4l7fs3ck4v4au2l7ebj7pzhsmvkxoytan5junaw2xsm4gqtq4au


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecavsa2l3lxxph6ba3nnvnvoej3ftmjgcpdvwjtr5m2ltph4se7gy

Address

f1edhwfgifichqhz74dri2uh2l2exl6jtpeuuersi

Datacap Allocated

1.95PiB

Signer Address

f174fg3bqbln3zjnkxtyf6s54txqkr7yqkj6cig7y

Id

80a52e6c-0f8d-43ac-8278-2dc66ba7c591

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecavsa2l3lxxph6ba3nnvnvoej3ftmjgcpdvwjtr5m2ltph4se7gy
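
As a reading aid, here is a minimal sketch that tallies the DataCap shown in the notary messages above against the 5PiB requested. It is not part of the original thread. It assumes that messages sharing the same Id (a proposal and its approval) refer to a single allocation, and it counts only the messages visible in this thread; the variable names are hypothetical.

```python
TIB_PER_PIB = 1024

# (request Id, amount in TiB) for each notary message shown in the thread.
messages = [
    ("2361116f-c8ab-426f-8751-20c5e2f937ad", 250.0),               # Request Proposed
    ("2361116f-c8ab-426f-8751-20c5e2f937ad", 250.0),               # Request Approved
    ("80a52e6c-0f8d-43ac-8278-2dc66ba7c591", 1.95 * TIB_PER_PIB),  # Request Approved
]

# Assumption: one allocation per Id, so duplicate Ids collapse to one entry.
allocations = dict(messages)

allocated_tib = sum(allocations.values())
requested_tib = 5 * TIB_PER_PIB
remaining_tib = requested_tib - allocated_tib

print(f"allocations counted: {len(allocations)}")
print(f"allocated: {allocated_tib / TIB_PER_PIB:.2f} PiB")
print(f"remaining of request: {remaining_tib / TIB_PER_PIB:.2f} PiB")
```

Keying on the Id avoids double-counting the 250TiB allocation, which appears once as proposed and once as approved.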

@hengdingy

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report Summary [1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

  • Overall Graphsync retrieval success rate: 0.00%
  • Overall HTTP retrieval success rate: 0.00%
  • Overall Bitswap retrieval success rate: 0.00%

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients [2]

✔️ No CID sharing has been observed.

Full report

Click here to view the CID Checker report.
Click here to view the Retrieval report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

@github-actions

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

@github-actions github-actions bot added the Stale label Jul 27, 2023
@Gitel-chu
Author

yes

@cryptowhizzard

#1616

Retrieval problems.

@github-actions github-actions bot removed the Stale label Jul 28, 2023
@Gitel-chu
Author

https://retrievalbot-dashboard.vercel.app/?clients=f02038606
This is our retrieval report.

I've contacted the SPs and asked them to support HTTP retrieval. Everything follows the community rules. I don't know anything about their CID sharing, but there is no such problem in my applications.

@cryptowhizzard

Thanks.

Please let me know once your SPs have told you when I will be able to download and check your data.

@github-actions

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

@github-actions github-actions bot added the Stale label Aug 12, 2023
@Gitel-chu
Author

https://retrievalbot-dashboard.vercel.app/?clients=f02038606 This is our retrieval report.

I've contacted the SPs and asked them to support HTTP retrieval. Everything follows the community rules. I don't know anything about their CID sharing, but there is no such problem in my applications.

Hi, they already supported retrieval before you asked this question.

@github-actions github-actions bot removed the Stale label Aug 15, 2023
@Gitel-chu
Author

We are ready to continue storing.

@github-actions

github-actions bot commented Sep 5, 2023

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

--
Commented by Stale Bot.

@github-actions

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

--
Commented by Stale Bot.

@Sunnyiscoming
Collaborator

Hello @Gitel-chu, per filecoin-project/notary-governance#922 for Open, Public Dataset applicants, please complete the following Fil+ registration form to identify yourself as the applicant, and please also add the contact information of the SP entities you are working with to store copies of the data.

This information will be reviewed by Fil+ Governance team to confirm validity and then the application will be allowed to move forward for additional notary review.
