
Community Diligence Review of ND Cloud (ND Labs) Allocator #13

Closed · filecoin-watchdog opened this issue May 10, 2024 · 17 comments
Label: Diligence Audit in Process (Governance team is reviewing the DataCap distributions and verifying the deals were within standards)

@filecoin-watchdog (Collaborator)
Review of Top Value Allocations from @NDLABS-Leo
Allocator Application: filecoin-project/notary-governance#1026

First example:
DataCap was given to:
NDLABS-Leo/Allocator-Pathway-ND-CLOUD#11

1st point)
A quick search shows this client has a history of questionable node usage: filecoin-project/filecoin-plus-large-datasets#2077 (comment). This should have been a flag.

Public Open Dataset - key compliance requirement: Retrievability

2nd point)
Allocation schedule per allocator:
First: The client will provide their weekly application volume, and for the initial allocation, we will allocate 50% of the weekly application volume.
Second: After review, the client will be allocated 100% of the weekly application volume.
Third: After review, the client will be allocated 200% of the weekly application volume.
Fourth: After review, the client will be allocated 200% of the weekly application volume.
Max per client overall: Upon successful review, the client will be allocated the weekly application volume minus the already allocated quota, with a maximum single application of 5P.
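For reference, the tranche schedule above can be sketched as a small calculator (an illustrative sketch only; the function name is mine and the 5 PiB cap handling is simplified):

```python
def tranche_allocation(weekly_volume_tib: float, tranche: int) -> float:
    """DataCap for a given tranche as a multiple of the client's weekly
    application volume, per the schedule quoted above (sketch only)."""
    MAX_SINGLE_TIB = 5 * 1024  # "maximum single application of 5P" (5 PiB)
    multipliers = {1: 0.5, 2: 1.0, 3: 2.0, 4: 2.0}
    return min(weekly_volume_tib * multipliers[tranche], MAX_SINGLE_TIB)

# A client applying for 1 PiB (1024 TiB) per week:
print([tranche_allocation(1024, t) for t in (1, 2, 3, 4)])
# [512.0, 1024.0, 2048.0, 2048.0]
```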

Actual allocations: 500 TiB, then 1 PiB. This follows the guidelines.

3rd point)
No sign of the KYC or KYB of the client or dataset mentioned in the allocator application. The allocator did ask the client about previous LDN applications and the SP nodes used.

4th point)
The client said these were the SPs:
f02211572 | Chengdu, Sichuan | MicroAnt
f02814600 | Chengdu, Sichuan | BigMax
f02226869 | Nanchang, Jiangxi | LuckyMine
f02274508 | Hong Kong | H&W
f02329119 | Hangzhou, Zhejiang | Cryptomage
f02837293 | Seoul, Seoul | FiveByte
f01159754 | Singapore, Singapore | VITACapital
f01852363 | Singapore, Singapore | HectorLi
f02321504 | Los Angeles, California | ipollo
f02320312 | Los Angeles, California | R1
f02327534 | Los Angeles, California | ipollo
f02322031 | Los Angeles, California | ipollo
f02320270 | Los Angeles, California | R1
f01853077 | Singapore, Singapore | Alpha100

Actual data storage report:
https://check.allocator.tech/report/NDLABS-Leo/Allocator-Pathway-ND-CLOUD/issues/11/1715238018641.md

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f03046248 | Hong Kong, HK (China Unicom Global) | 319.44 TiB | 47.27% | 319.44 TiB | 0.00% |
| f02023435 | Hong Kong, HK (HK Broadband Network Ltd.) | 119.44 TiB | 17.67% | 94.19 TiB | 21.14% |
| f02894855 | Unknown | 119.28 TiB | 17.65% | 119.28 TiB | 0.00% |
| f02956383 | Hong Kong, HK (ANYUN INTERNET TECHNOLOGY (HK) CO., LIMITED) | 78.03 TiB | 11.55% | 78.03 TiB | 0.00% |
| f02948413 | Chengdu, Sichuan, CN (China Mobile Communications Group Co., Ltd.) | 39.59 TiB | 5.86% | 39.59 TiB | 0.00% |

None of the SP IDs taking deals in the report matches the client's list. Additional diligence is needed to confirm the entities and actual storage locations.
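The mismatch is easy to verify by intersecting the two ID sets (IDs copied from the client list and the report above):

```python
# SPs the client declared (from the list above)
declared = {
    "f02211572", "f02814600", "f02226869", "f02274508", "f02329119",
    "f02837293", "f01159754", "f01852363", "f02321504", "f02320312",
    "f02327534", "f02322031", "f02320270", "f01853077",
}
# SPs that actually sealed deals (from the report above)
sealed = {"f03046248", "f02023435", "f02894855", "f02956383", "f02948413"}

print(sorted(declared & sealed))  # [] -> zero overlap
```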

5th point)
A second allocation was awarded to this client. However, per the Spark dashboard, all SPs are either unavailable or have 0% retrievability.

The allocator showed no sign of diligence after the first allocation and still gave the client a second allocation of 1 PiB.
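The diligence gap described here can be made concrete: before signing a follow-up tranche, each SP's Spark retrievability could be checked against a minimum threshold. This is a hypothetical gating rule for illustration, not the allocator's actual process; the data shape and threshold are assumptions:

```python
def block_next_tranche(spark_rates, min_rate=0.0):
    """Return the SPs that should block a follow-up allocation:
    those with no Spark data (None) or a retrieval rate at or
    below the threshold. Hypothetical gating rule, for illustration."""
    return sorted(
        sp for sp, rate in spark_rates.items()
        if rate is None or rate <= min_rate
    )

# All of this client's SPs are unavailable or at 0% per the review:
rates = {"f03046248": 0.0, "f02023435": 0.0, "f02894855": None,
         "f02956383": 0.0, "f02948413": 0.0}
print(block_next_tranche(rates))  # every SP blocks the next tranche
```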

@filecoin-watchdog (Collaborator, Author)

Second example:
DataCap was given to:
NDLABS-Leo/Allocator-Pathway-ND-CLOUD#13

Public Open Dataset - key compliance requirement: Retrievability

1st point)
Allocation schedule per allocator:
First: The client will provide their weekly application volume, and for the initial allocation, we will allocate 50% of the weekly application volume.
Second: After review, the client will be allocated 100% of the weekly application volume.
Third: After review, the client will be allocated 200% of the weekly application volume.
Fourth: After review, the client will be allocated 200% of the weekly application volume.
Max per client overall: Upon successful review, the client will be allocated the weekly application volume minus the already allocated quota, with a maximum single application of 5P.

Actual allocations: 50 TiB, then 1 PiB. This follows the guidelines.

2nd point)
The allocator mentioned giving only 50 TiB to start as a test: NDLABS-Leo/Allocator-Pathway-ND-CLOUD#13 (comment)

3rd point)
The client said these were the SPs; no entities were mentioned, and the allocator asked no questions:
f02023435 - Hong Kong
f02948413 - Sichuan
f02836485 - Singapore
f02320270 - USA
f02864389 - China

Actual data storage report:
https://check.allocator.tech/report/NDLABS-Leo/Allocator-Pathway-ND-CLOUD/issues/13/1715346764664.md

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f02956383 | Hong Kong, HK (ANYUN INTERNET TECHNOLOGY (HK) CO., LIMITED) | 193.72 TiB | 20.87% | 193.72 TiB | 0.00% |
| f03066382 | Chengdu, Sichuan, CN (China Mobile Communications Group Co., Ltd.) | 291.31 TiB | 31.39% | 291.16 TiB | 0.05% |
| f02948413 | Chengdu, Sichuan, CN (China Mobile Communications Group Co., Ltd.) | 15.84 TiB | 1.71% | 15.84 TiB | 0.00% |
| f03046248 | Hong Kong, HK (China Unicom Global) | 197.66 TiB | 21.30% | 197.59 TiB | 0.03% |
| f02023435 | Hong Kong, HK (HK Broadband Network Ltd.) | 31.06 TiB | 3.35% | 31.06 TiB | 0.00% |
| f03064819 | Seoul, KR (Korea Telecom) | 198.47 TiB | 21.39% | 198.16 TiB | 0.16% |

Only 2 SPs from the original list (f02948413 and f02023435) match, accounting for roughly 5% of all deals. Additional diligence is also needed to confirm the entities and locations.
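Checking the overlap and its deal share against the two lists above:

```python
# SPs the client declared for this application
declared = {"f02023435", "f02948413", "f02836485", "f02320270", "f02864389"}
# provider -> percentage of sealed deals, from the report above
report = {"f02956383": 20.87, "f03066382": 31.39, "f02948413": 1.71,
          "f03046248": 21.30, "f02023435": 3.35, "f03064819": 21.39}

matched = {sp: pct for sp, pct in report.items() if sp in declared}
print(sorted(matched), round(sum(matched.values()), 2))
# ['f02023435', 'f02948413'] 5.06
```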

4th point)
A second allocation was awarded to this client. However, per the Spark dashboard, all SPs are either unavailable or have 0% retrievability.

The allocator showed no sign of diligence after the first allocation and still gave the client a second allocation of 1 PiB.

@filecoin-watchdog (Collaborator, Author)

Third example:
DataCap was given to:
NDLABS-Leo/Allocator-Pathway-ND-CLOUD#15

Public Open Dataset - key compliance requirement: Retrievability

SPs provided:
f03046248 / Pinapple / HK
f02956383 / Future / HK
f02948413 / FineTune / Sichuan
f02226869 / LuckyMine / LA
f02837293 / FiveByte / Seoul

Actual report data:
https://check.allocator.tech/report/NDLABS-Leo/Allocator-Pathway-ND-CLOUD/issues/15/1715346713876.md

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f02956383 | Hong Kong, HK (ANYUN INTERNET TECHNOLOGY (HK) CO., LIMITED) | 288.19 TiB | 30.83% | 288.19 TiB | 0.00% |
| f03066382 | Chengdu, Sichuan, CN (China Mobile Communications Group Co., Ltd.) | 197.06 TiB | 21.08% | 196.59 TiB | 0.24% |
| f03046248 | Hong Kong, HK (China Unicom Global) | 149.41 TiB | 15.99% | 149.41 TiB | 0.00% |
| f03064819 | Seoul, KR (Korea Telecom) | 299.97 TiB | 32.10% | 299.97 TiB | 0.00% |

2 SPs match. As in all the other applications, the SPs show 0% retrievability.

@Kevin-FF-USA added the "Diligence Audit in Process" label on May 13, 2024
@Kevin-FF-USA self-assigned this on May 13, 2024
@NDLABS-Leo

Hi, @filecoin-watchdog

What are the key questions, please?
If it's retrievability: when this allocator was first enabled, the bot did not display a retrieval rate; that was only added recently. During the review process, if the bot report is fine according to my application rules, the client can be assigned a new round.

@filecoin-watchdog (Collaborator, Author) commented May 27, 2024

@NDLABS-Leo - I am just pointing out what I see in the data for the Governance Team to use in their review. Yes, retrievability is the main problem: why did clients continue getting DataCap when their SPs were not following guidelines?

@filecoin-watchdog (Collaborator, Author) commented May 27, 2024

(screenshot)

@filecoin-watchdog (Collaborator, Author) commented May 27, 2024

(screenshot)

@filecoin-watchdog (Collaborator, Author) commented May 27, 2024

(screenshot)

@Kevin-FF-USA (Collaborator)

Hi @NDLABS-Leo

At the next Fil+ Allocator meeting we will be going over each refill application. I wanted to ensure you were tracking the review discussion taking place in #13.

If your schedule allows, I recommend coming to the May 28th meeting to answer and discuss the issues raised about the recent distributions; that will let you address them faster. Alternatively, you can continue the written discussion in this Allocator Governance issue.

Warmly,
-Kevin
https://calendar.google.com/calendar/embed?src=c_k1gkfoom17g0j8c6bam6uf43j0%40group.calendar.google.com&ctz=America%2FLos_Angeles

@NDLABS-Leo

Hi, @Kevin-FF-USA
Thanks for the notice. I'll be there on time tonight.

@NDLABS-Leo commented May 28, 2024

Hi, @filecoin-watchdog
Thank you for asking questions based on facts. No offence, but I would like to point out that, as Allocator admins, we are responsible for our review: if we continued to issue DataCap while a client's SP nodes did not comply with the rules, then something would be wrong with our review. However, our signing decisions are based on the bot report at the time of review, and we do not delegate DataCap if we find any non-compliance issues.

Also, here are three points I would like to clarify:

First, the retrieval rates of the nodes in your screenshots are all from data added after the bot's retrieval function was recently introduced. Recall that at the time of my review, which was relatively early, no retrieval rate was displayed; I could only run sample retrieval tests against CIDs from the chain. And, as you can see, the bot reports were good before I signed.

Second, regarding the retrieval rate: the retrieval tests we conducted and the retrieval rate given by the current bot are not consistent. For example, retrieval previously supported three methods (HTTP/Graphsync/Bitswap), but the current metric is the mean Spark retrieval rate. As you can see on Spark's website, only 824 nodes in the whole network are currently included in retrieval testing; the remaining nodes cannot be monitored. This is the inconsistency between our manual retrieval and the current bot retrieval data that I want to raise.

Third, following on from the second point, we have contacted the customer to troubleshoot the problem. I believe there will be a conclusion soon; I will then make that conclusion public here, and I hope RKH can record this situation to prevent some SPs from being misunderstood because of technical problems.

@willscott (Collaborator) commented May 28, 2024 via email

@NDLABS-Leo

Hi, @willscott
By the way, each Allocator has its own vetting criteria. The bot is indeed not exactly a vetting criterion, but it is something we determine at the time of application and use as a vetting and tracking tool for our clients.

(screenshot)

@NDLABS-Leo

@Kevin-FF-USA @galen-mcandrew @filecoin-watchdog
We attended the notary meeting last night and spoke there to explain the situation. The consensus at this point is that data retrieval needs to be verified through Spark. We will communicate this to our clients, as well as to the SPs we can reach.

@galen-mcandrew (Collaborator)

Based on a further diligence review, this allocator pathway is partially in compliance with their application.

Specifically:

  • Mixed evidence of diligence with clients (no verification of client claims)
  • Subsequent allocations given despite noncompliant client deal-making, with minimal allocator intervention through comments
  • No retrievability for datasets, despite claims of public open data by both allocator and client (not showing distributed network data storage utility)

Given this mixed review, we are requesting that the allocator verify that they will uphold all aspects & requirements of their initial application. If so, we will request an additional 2.5PiB of DataCap from RKH, to allow this allocator to show increased diligence and alignment.

@NDLABS-Leo can you verify that you will enforce program and allocator requirements? (for example: public diligence, tranche schedules, and public scale retrievability like Spark). Please reply here with acknowledgement and any additional details for our review.

@NDLABS-Leo

@galen-mcandrew
#13 (comment)
As mentioned above, the main problem with our current review is the retrievability of nodes.

We have also communicated with the Spark team on Slack several times about this issue:
https://filecoinproject.slack.com/archives/C06MTBZ44P2/p1717139259611159
https://filecoinproject.slack.com/archives/C03S6LXSRB8/p1717055709286189

We are also coordinating our technology with the Spark technical team and are actively running tests with Spark.
We have made initial progress so far: the current finding is that Spark also counts invalid CIDs in its retrieval-rate statistics and gives them a 0% retrieval-rate result. We are testing this further, and once we reach a conclusion I will share it with the community so that as many SPs as possible know about it.

So I am not happy with the result of assigning 2.5 PiB to our pathway in the second round, and I hope RKH can reassess in light of this latest situation. Since Spark has a large reach, most Allocator reviews will probably be affected, and we are pushing through our own efforts to deal with this issue. Once the Spark issue is dealt with, the retrieval-rate data will be corrected.

@NDLABS-Leo

@galen-mcandrew
Hello Galen.
On the issue of node retrieval: with our recent continuous progress, 2 nodes have successfully obtained retrieval rates.
(screenshot)
(screenshot)
Also, another SP node has integrated with Spark's services but has not yet shown an update in Spark's system, so it should only be a matter of time.
SPs that can currently support Spark retrieval: f02948413, f02826234, f02826123

The problems we have identified so far are as follows:
1. If an SP's data includes an invalid CID, Spark cannot produce a retrieval rate for it; the invalid CID needs to be removed.
2. The current Boost setup is easier to integrate with Spark; if an SP is still running plain Lotus, the index needs to be built manually.

We are currently passing these lessons on to the SPs, and I believe you will soon see Spark retrieval rates being reported!
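One rough way to pre-screen for the invalid-CID problem described above is a base32 sanity check on deal CIDs before they reach Spark. This is a simplified sketch of my own: it only validates the multibase base32 encoding of a CIDv1 string, not the multicodec or multihash contents:

```python
import base64

def looks_like_base32_cidv1(cid: str) -> bool:
    """Rough sanity check: a base32 CIDv1 starts with the multibase
    prefix 'b' and the rest must decode as RFC 4648 base32.
    Does NOT verify the multicodec or multihash inside the CID."""
    if len(cid) < 2 or not cid.startswith("b"):
        return False
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore stripped base32 padding
    try:
        base64.b32decode(body)
        return True
    except Exception:
        return False

print(looks_like_base32_cidv1(
    "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi"))  # True
print(looks_like_base32_cidv1("not-a-cid"))  # False
```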

@NDLABS-Leo

@Kevin-FF-USA @galen-mcandrew
Looking forward to your replies.
