How to search for file and folder name inside repo? #9005

Closed
finzzz opened this issue Nov 14, 2019 · 20 comments
Labels
type/proposal The new feature has not been accepted yet but needs to be discussed first.

Comments

@finzzz

finzzz commented Nov 14, 2019

I have enabled the repo indexer so that I can search code inside the files of a given repo. But I also want to search by file/folder name. How do I do that?

I have tried **keyword, **keyword**, **.py, etc., but none of them worked.

@lunny added the type/proposal label Nov 15, 2019
@lunny
Member

lunny commented Nov 15, 2019

It's not supported yet.

@guillep2k
Member

Sorry for the cross-post:
It should not be that difficult to add file names to the repo indexer as indexed words; file names in this case should probably skip the indexer glob filter?

@ili101

ili101 commented May 5, 2020

Any progress on this?
Or is there a workaround available?

@guillep2k
Member

@ili101 we gladly welcome PRs. 😁

@ili101

ili101 commented May 6, 2020

@ili101 we gladly welcome PRs. 😁

That's an awesome thing! Unfortunately, Go is not one of my areas of expertise (yet) 😁

@love1900905

+1. Many times, the filename/pathname is even more meaningful than the file content.

@rengui

rengui commented Dec 17, 2021

First of all, thanks for the great Gitea! I have been using Gitea heavily, and it has stayed lightweight and smooth for years!

Back to this topic: I recently needed this feature as well. After some struggling, I resolved it today by changing one line of code (Gitea branch 1.15.7, for bleve search only):

// change modules\indexer\code\elastic_search.go if you are using ES.
modules\indexer\code\bleve.go

func (b *BleveIndexer) addUpdate(batchWriter ...
	......
	return batch.Index(id, &RepoIndexerData{
		RepoID:   repo.ID,
		CommitID: commitSha,
		Content:  string(charset.ToUTF8DropErrors(fileContents)), // <--- this line

==>

	// Content:   string("Pathname: ") + string(update.Filename) + string(" \n") + string(charset.ToUTF8DropErrors(fileContents)),
	// Replace '/' with ' / ' so that bleve treats '/foo/bar.txt' as '/ foo / bar.txt'
	//   (it can then be found by searching for either 'foo' or 'bar'; otherwise bleve seems to match only the whole '/foo/bar.txt' ???)
	Content:   string("Pathname: ") + strings.ReplaceAll(string(update.Filename), "/", " / ") + string(" \n") + string(charset.ToUTF8DropErrors(fileContents)),

Then build your own gitea.exe and enjoy it :)

After replacing gitea.exe, you may need to delete all the code search index files (app.ini: REPO_INDEXER_PATH = ) and restart Gitea to trigger a full reindex (with pathname search enabled).

@lunny
Member

lunny commented Dec 17, 2021

A general method is to add a new field Filename in RepoIndexerData.
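
A rough Go sketch of the suggestion above, kept separate from the real Gitea types: `repoIndexerDoc` and `newRepoIndexerDoc` are hypothetical names, only the fields mentioned in this thread are shown, and the charset conversion used by the real indexer is left out. The index mapping and the search query would presumably also need to cover the new field; that part is not shown here.

```go
package main

import "fmt"

// repoIndexerDoc is a hypothetical stand-in for RepoIndexerData. Only the
// fields that appear in this thread are included, plus the proposed Filename.
type repoIndexerDoc struct {
	RepoID   int64
	CommitID string
	Filename string // proposed new field: the path, indexed separately from Content
	Content  string
}

// newRepoIndexerDoc builds one document per file. The path is stored in its
// own field instead of being prepended to Content as in the one-line hack.
func newRepoIndexerDoc(repoID int64, commitSha, filename string, fileContents []byte) repoIndexerDoc {
	return repoIndexerDoc{
		RepoID:   repoID,
		CommitID: commitSha,
		Filename: filename,
		Content:  string(fileContents),
	}
}

func main() {
	doc := newRepoIndexerDoc(1, "abc123", "modules/indexer/code/bleve.go", []byte("package code"))
	fmt.Printf("%+v\n", doc)
}
```

Keeping the path in a dedicated field means the file content stays unmodified, so a query could target names, content, or both.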

@rengui

rengui commented Dec 17, 2021

A general method is to add a new field Filename in RepoIndexerData.

Yes, you are right. From a Gitea developer's point of view, it should be done in a general way to ensure code quality and extensibility.
What I described was just from a Gitea user's point of view: a hack to get a working workaround as soon as possible. :)

@delanym

delanym commented Jun 30, 2022

@lunny why can't your workaround be a PR?

delanym added a commit to delanym/gitea that referenced this issue Jul 4, 2022
@wxiaoguang
Contributor

wxiaoguang commented Jul 4, 2022

FYI, there is a new feature: Go to file

It should be more convenient if you know you are searching for file names:

image

image

@delanym

delanym commented Jul 4, 2022

@wxiaoguang thanks, this does what I expected.
It's a little slow on large repos each time I bring up the page to do a search. What configuration options are there?

@wxiaoguang
Contributor

wxiaoguang commented Jul 4, 2022

It's a little slow on large repos each time I bring up the page to do a search.

Yup, it's not optimized yet. How large is your repo? For the Linux kernel repo (4G, 80k files), it takes about ten seconds on my side.

What configuration options are there?

Which configuration do you mean? This feature doesn't have a config option yet; it's in the 1.17 release.

@delanym

delanym commented Jul 4, 2022

@wxiaoguang I mean a config option to enable a cache. I can cope with 10 seconds the first time the repo is indexed, but not every time I search for a file.
I'm testing on a repo/branch with 86000 files, and it's more like 15 seconds.
The equivalent find takes 0.18s: find . -iname "*IdleState*"

Also, it should treat the search query as a contiguous string, not as a sequence of individual characters. Currently a search for "IdleState" returns a file like
IssuingSystem/Modules/UserControls/Views/ShortcutBarPresenter.cs

And actually, if I know anything about grep, it's probably slower because it's searching for those individual characters.
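
The difference described above can be sketched in Go. This assumes the "Go to file" filter behaves like a case-insensitive subsequence match, which is consistent with the reported result but is not taken from the Gitea source; `isSubsequence` and `containsFold` are hypothetical helpers written for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// isSubsequence reports whether the characters of query occur in path in the
// same order, not necessarily adjacent (case-insensitive). This is the
// assumed "sequence of chars" behaviour.
func isSubsequence(query, path string) bool {
	q := []rune(strings.ToLower(query))
	i := 0
	for _, r := range strings.ToLower(path) {
		if i < len(q) && r == q[i] {
			i++
		}
	}
	return i == len(q)
}

// containsFold reports whether query occurs in path as one contiguous
// substring (case-insensitive), comparable to find -iname "*query*".
func containsFold(query, path string) bool {
	return strings.Contains(strings.ToLower(path), strings.ToLower(query))
}

func main() {
	path := "IssuingSystem/Modules/UserControls/Views/ShortcutBarPresenter.cs"
	fmt.Println(isSubsequence("IdleState", path)) // true: the letters appear scattered, in order
	fmt.Println(containsFold("IdleState", path))  // false: no contiguous "IdleState"
}
```

A contiguous match returns far fewer candidates, which is why the path above shows up under one strategy and not the other.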

@oetiker

oetiker commented Jul 4, 2022

In our setup, we would love to be able to search for a filename across all repos ...

@wxiaoguang

This comment was marked as off-topic.

@michaelfresco

@delanym
I was looking at the code from Lunny, but this function looks a bit different at the moment. Does anyone know how to implement the fix right now?

func (b *BleveIndexer) addUpdate(ctx context.Context, batchWriter git.WriteCloserError, batchReader *bufio.Reader, commitSha string,

@delvh
Member

delvh commented Apr 29, 2023

This has already been implemented for some time (I think since 1.18?):
image
with the following dialogue
image

@delvh delvh closed this as completed Apr 29, 2023
@wxiaoguang
Contributor

Hmm, IIRC this issue is asking about the "repo indexer". "Go to file" could help in some cases, but it doesn't 100% resolve the issue.

@delvh
Member

delvh commented Apr 30, 2023

Ah, you mean a global search through all repos?
Yes, that isn't possible yet.
However, this issue reads like Go to file to me, as it does exactly what this issue is asking for (search for file and folder names inside repo)…

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 15, 2023