evaluate backend storage strategy #293

brainchild0 · 2019-01-18T08:36:17Z · edited by jospoortvliet

I am a newcomer to the project both as a user and one reviewing the current development activities, and I am excited about the prospect of a notes application, as well as the larger Nextcloud platform, to provide a less distracting, intrusive, and inflexible alternative to commercial services, while retaining high standards of design quality and operational reliability.

Having made a best effort to understand the current project state and future ambitions, I have come to think it appropriate now to consider the desired user experience of the notes app as it relates to files, filenames, titles, and metadata, and whether the current file-driven backend storage strategy ultimately allows that experience to be realized.

Since I am new, I hope that I am not merely creating distractions by repeating old suggestions, whether they were earlier accepted or rejected. I also respect that some may variously feel the time too soon or too late for this discussion, but I would say better early than late, and better slightly late than very late.

Currently, in the backend storage, as far as I understand, each note is a file in a directory tree, and all data related to each note are completely determined by the file contents and location. The discussion in #190 is considering whether the file may also be given a user-specified name, to serve as the note title.

Agreeing with many writing in #190, I expect that many users would be frustrated by being unable to select an explicit title. Three reasons are apparent:

As I explained earlier and at length in "decouple note title from first line of contents" (#289), linking the first line of a file body to how it is displayed in a list breaks the common assumption that a file name is used to select what information is processed, and is distinct from the information itself. This problem may have limited consequences if the notes app is the only application that processes the notes files, but then much of the benefit of the file-based representation is lost. Consider the possibility that the ultimate target of a notes file is pandoc.
Markdown is intended for relatively short sequences of text, with support for heading structure, but not for titles. Without extensions it is not suitable for self-contained documents, and no feature common to all current dialects gives a document a way to specify a title that a processor recognizes and formats properly. Any title assigned to a Markdown text must be specified externally. Pandoc uses YAML metadata blocks for this purpose, and other tools use other solutions.
More generally, as many will intuitively realize, the assumption that the title of a structured text is necessarily its first line is overly restrictive and often invalid. The title may be the third line, or the first line combined with the second, and so on.

A further consideration, more broad than mere titles, is that many users, I think, will be unhappy with a notes application that looks sufficiently similar to a file browser. They have no reason not to use the latter, if they find no experience in the former that more intuitively facilitates their need to constantly collect and to organize mundane text.

As such, the ways in which a human intuitively organizes information, as distinct from how software engineers have done so, are the central concern.

Increasingly, users think of computer data as they think of physical objects, based on what they are rather than what they have been labeled. When we look for an e-mail message in our inbox, we think about the date, sender, and subject. The message headers may contain unique identifiers, but users don't know or care. The data element most similar to a file name or title in an e-mail message is the subject line, but its purpose is to describe, not to identify. Equally, in recent days, I have used a photo browser, a music player, and a voice recorder, and while in each case the underlying assets were files, the representation in the interface was driven not by file name or location, but by content and metadata, such as thumbnail, artist name, or recording length. Details of the file tree could be shown in the application if desired, but by default were abstracted from my view.

Commonly used note applications are similar. They generally attach metadata to each note, such as a timestamp, and sometimes cached thumbnail previews. Moreover, the ones I have used will even let me create an exact copy of any note, body and title, with no need to resolve a name collision. The duplicate appears identical to the original, including having an identical title, and is created instantly and unconditionally.

If the application is to produce an experience that allows it to compete for the same user base as these solutions, then it is difficult for me to see how the simple file-tree representation is adequate. To be sure, many commercial solutions are bloated, and it is unnecessary to consider more than a minimal set of features that makes a notes application handle the range of common uses. (Which side of this divide to place attachments is to me an open question.)

From the above concerns, I offer some suggestions:

1. Support titles, but don't require them for every note. Coping with either case appropriately is a presentation issue, so making it a persistence issue will cause problems. An application could, for example, represent a list of notes by title, if given, otherwise by first line, along with modification date. Automatically assigning and saving a title based on the first line of the first draft is also reasonable, but not dynamic. Redundant data need not be persisted when the user interface can dynamically resolve the missing fields such that they always reflect the current values of the fields from which they derive. A dynamic approach also means that display issues from bugs are resolved immediately upon upgrade, even for data sets created by the earlier buggy version.
2. If the model of one file per note is easy and appealing, then keep it. I have advocated for its advantages. But different notes ought to be able to have the same title, and all notes must have metadata, not limited to titles. Creation and modification dates should be saved, and they are different from the corresponding fields in the file system. If a cloud instance is copied to a new storage volume, then the file timestamps might change, so the application-level metadata must be separate, perhaps inside a database table. If a table that holds these fields points to a file name, then each note must have a file name that is static, automatic, and unique. A hash code or a timestamp-based naming scheme would each work, the latter being more transparent.
3. If file-level client synchronization of a notes collection is useful, but files are given opaque names for the reasons given above, then perhaps exposing a virtual file-tree view via WebDAV or REST is a better option than directly exposing the physical file tree. Maybe "file name" could be a metadata field of each note, exposed as the file name within the virtual view, to facilitate integration with client software (e.g. standalone Markdown editors and processors) that opens files by name. This virtual file view avoids the problem of exposing opaque file names to the client, while still allowing the application to reliably produce metadata-driven user interactions. This model also allows metadata to be represented directly in each file header, if desired, rather than in an external database table, because the header can be filtered away in the virtual view.
4. Support tags. While I personally like that categories can be nested as tree elements, not simply flat lists, I think that this advantage is less important than (though not in conflict with) the demand for tags, which are currently supported both by other Nextcloud apps and by many client applications written expressly for the Nextcloud notes app. In the latter case, tags are saved on the client but cannot be synchronized with the server, or with other clients, because of the limitation. If the notes app does not support tags, then it will be conspicuous in the Nextcloud ecosystem for their absence.
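
The first two suggestions above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of how the notes app is or should be implemented: `NoteMeta`, `new_file_name`, and `display_title` are hypothetical names, the in-process counter is a stand-in for whatever uniqueness guarantee a real store would provide, and the metadata record could equally be a row in a database table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
import itertools

# Hypothetical tie-breaker so two notes created in the same second get distinct names.
_counter = itertools.count()


@dataclass
class NoteMeta:
    """Application-level metadata, kept apart from file-system attributes."""
    file_name: str                # static, automatic, unique; not chosen by the user
    created: datetime
    modified: datetime
    title: Optional[str] = None   # optional, and need not be unique


def new_file_name(now: datetime) -> str:
    """Build a timestamp-based storage name such as '20190118T083617Z-0.md'."""
    return f"{now.strftime('%Y%m%dT%H%M%SZ')}-{next(_counter)}.md"


def display_title(meta: NoteMeta, body: str) -> str:
    """Resolve the list label at display time instead of persisting it."""
    if meta.title:
        return meta.title
    for line in body.splitlines():
        # Drop leading heading markers so '# Shopping list' shows as 'Shopping list'.
        candidate = line.strip().lstrip("#").strip()
        if candidate:
            return candidate
    return "(empty note)"
```

Because `display_title` is computed on every read, a note without a saved title always shows its current first line, and a fix to the fallback rule applies to existing notes without any migration.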

I apologize to anyone who finds my observations pedantic or distracting. My objective is to help suggest how the design efforts might be directed toward making the notes app as effective as possible.


korelstar · 2019-01-26T13:03:21Z

Thanks for your feedback. I will give some thoughts on your suggestions:

ad 1. (separate title which can differ from filename): We already have this discussion in #190, so please don't start a new discussion again and again (see also #289).

ad 2a. (unique title): Why do you need to have notes with the same title? I think it's more convenient for users to be able to uniquely identify a note in an easy way.

ad 2b. (timestamps): You are talking about an edge case. If you're moving your data to another file system, then ensure that the modification dates are preserved.

ad 2c. (stable file names): this will be solved by 1.

ad 3. (virtual file interface): I can't see why this could be helpful. We don't want to use opaque file names.

ad 4. (tags): We have categories instead of tags, see #8. Please feel free to open a new issue for a discussion about providing tags functionality in addition to category functionality, since this topic is independent from the storage topic. But please give arguments for why this is needed and how we could integrate it without cluttering the user interface.

brainchild0 · 2019-02-02T19:41:32Z

Thank you for responding to each of my points one after the other. Reading your responses, I fear that the larger message of my original comments may have been lost. If I respond to all of your comments, then it is likely that we will continue to talk past each other, with the larger message remaining out of reach.

I think that one of your responses, about my suggestion to drop the uniqueness requirement for titles, very compellingly represents an essential difference in the current approach from the one I am asking you to consider. I will give my thoughts on this issue only, and ask that you review them.

Notes applications, I argue, are often used in situations where time is limited, and on devices, such as mobiles, where input is slow. And yet these applications must still support a variety of use cases. One such case is quickly to record a small segment of text before forgetting it, whatever the user's current location or activity. Another is to maintain a vast collection of small, independent segments of text, without always being burdened by the need to implement a clear or precise organizational scheme.

The assumption underlying unique file names meets a different, more limited set of requirements. Among these requirements are that 1) a user will take some care to consider a useful name when creating a new resource, such that 2) that name will relate to the purpose of the resource, and 3) not relate to other resources with which it may be confused. Further, 4) applications and users will remember or record that name, to be used to retrieve the resource when needed, by resolving the path in the file system, and 5) resolving the path will be the principal means of retrieving the resource. This paradigm is an old one, dating to a time when the thought of a computer taking the function of a familiar paper notebook would have been laughable.

Notes in a notebook are conceptually different from files on a file system, regardless of how they might be stored in the backend. That a notes application must represent the particular abstractions needed to facilitate intuitive, uncoordinated, everyday thinking is a critical insight for developing an effective notes application.

Suppose a user needs to duplicate a note. Three possibilities present themselves: 1) the user chooses a new title for the duplicate, 2) the application chooses a new title, or 3) the two notes both keep the original title. In case (1), the user is burdened by choosing a new title. Perhaps doing so is "more convenient" in the special case when a user sits in an office chair with little to do but contemplate what title to choose for the duplicate. Ultimately, though, this restriction is incompatible with the broader requirements summarized earlier. In case (2), the user has two notes that will diverge in content, one of them carrying an automatically modified title. The specific modification, whether an appended number or some other automated operation, cannot possibly represent the actual reason for the duplication of the note or the details about how the content will diverge. The user is still burdened by needing to remember which title, the original or the modified one, corresponds to any particular set of changes to the content. The modification of the title has no purpose except to preserve the uniqueness of each title.

The requirement that titles are unique, so far, has been imposed, but not justified.

Case (3) remains, and avoids the earlier problems. The objection against it is, "Why do you need to have notes with the same title?" I would ask you to consider that the very question may derive from a bias from working with file systems, as this question ought not to be the first to consider. The first question, rather, ought to be, "Why do notes need to have unique titles? Why can't you have notes with identical titles?" If no answer can be given that is convincing against the practical realities previously described, then the uniqueness requirement should be dropped, in favor of convenience and flexibility for the user. So far, no such answer has been given. Incredulity about why notes need to have the same title cannot be a justification for forcing them to have different ones. Equally, it is begging the question merely to remind us of a choice to represent notes as files with titles as the file name, because the choices of backend storage must follow from the use case requirements, not determine them after the choices were already made arbitrarily or summarily.
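
To make case (3) concrete, here is a minimal sketch, assuming notes are identified by an internal identifier rather than by title. The `Notebook` class, the `note_id` field, and the use of random UUIDs are hypothetical choices for illustration only; nothing here reflects the current app.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Note:
    note_id: str   # hypothetical internal identifier; the user never needs to see it
    title: str
    body: str


class Notebook:
    """In-memory stand-in for a store keyed by identifier instead of title."""

    def __init__(self) -> None:
        self._notes: Dict[str, Note] = {}

    def add(self, title: str, body: str) -> Note:
        note = Note(uuid.uuid4().hex, title, body)
        self._notes[note.note_id] = note
        return note

    def duplicate(self, note_id: str) -> Note:
        """Copy title and body unchanged; only the identifier is new."""
        original = self._notes[note_id]
        copy = replace(original, note_id=uuid.uuid4().hex)
        self._notes[copy.note_id] = copy
        return copy

    def titled(self, title: str) -> List[Note]:
        """A title describes notes; it may match more than one."""
        return [n for n in self._notes.values() if n.title == title]
```

With this shape, duplication never fails and never asks a question, because no collision is possible on the only field that must be unique.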

Then, the broader message is that files are not notes and notes are not files. Again, notes may be represented as files. This approach has advantages, but then the relation is one of physical representation, not of sameness in concept, and certain accommodations must be made on the backend, which is why I took the time in the original comments to discuss possibilities such as database tables and opaque file names. These particular accommodations may not be the best ones, but to avoid any accommodations altogether is, I fear, to lose the message.

Thank you for reviewing these concerns. I suggest that if you consider them carefully and openly, you might start to agree that creating a compelling design for a notes application asks us to move toward a broader approach than the one represented so far in the discussions.

brainchild0 · 2019-03-28T15:24:36Z

Further thoughts: Many application formats feature date information inside the file contents, such that the application need not rely exclusively on the file timestamp. It may be wise not to dismiss the suggestion summarily.

An example among Markdown-based applications is Jekyll, a lightweight CMS that utilizes YAML headers, which can include date information. File system dates are considered an unreliable source to determine the actual date associated with a document. (See Jekyll documentation.)

Similarly, an interesting blog post argues that a notes system ought to contain metadata embedded into notes that can be read by a local indexing service or application. This article refers to and supports the Jekyll strategy for metadata in Markdown-formatted notes. (See Metadata section, as the earlier sections are less relevant to this discussion.)

Following this author's thoughts, it would be easy to imagine a feature in Nextcloud Notes that tries to resolve timestamp information from metadata inside the file. Ideas such as this one are the kind I would ask you to consider.
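
A rough sketch of that idea, assuming a Jekyll-style header delimited by `---` lines and containing a `date` field in ISO 8601 form. The function names are hypothetical, and the header handling is deliberately simplified to flat `key: value` lines; a real implementation would use a proper YAML parser.

```python
from datetime import datetime, timezone
from typing import Dict, Tuple


def split_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a leading '---' ... '---' header from the note body.

    Only flat 'key: value' lines are handled; this is not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            header: Dict[str, str] = {}
            for line in lines[1:end]:
                key, sep, value = line.partition(":")
                if sep:
                    header[key.strip()] = value.strip()
            return header, "\n".join(lines[end + 1:])
    return {}, text  # no closing delimiter: treat everything as body


def note_timestamp(text: str, file_mtime: float) -> datetime:
    """Prefer an embedded 'date' field; fall back to the file system."""
    header, _ = split_front_matter(text)
    try:
        stamp = datetime.fromisoformat(header["date"])
    except (KeyError, ValueError):
        return datetime.fromtimestamp(file_mtime, tz=timezone.utc)
    # Assumption: a date written without an offset is taken to be UTC.
    return stamp if stamp.tzinfo else stamp.replace(tzinfo=timezone.utc)
```

The body returned by `split_front_matter` is also what a virtual file view could serve to clients, with the header filtered away as described in the third suggestion of the original comment.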

Of course, Pandoc, which is not a notes application but is a very widely used software package, also uses this design.

korelstar added the "needs discussion" label (Need to clarify if and how we should implement this) Jan 19, 2019

brainchild0 changed the title ~~Discussion: backend storage strategy~~ evaluate backend storage strategy Jan 19, 2019

brainchild0 mentioned this issue May 1, 2020

Note usage through ~all apps #389

Open

joshtrichards added the "feature request" label (Requests for complete new features) Aug 8, 2024
