docs: Settings Simplification ADR #36224

355 changes: 355 additions & 0 deletions docs/decisions/0022-settings-simplification.rst
@@ -0,0 +1,355 @@
Django settings simplification
##############################

Status
******

Accepted

Implementation tracked by: https://github.com/openedx/edx-platform/issues/36215

Context
*******

OEP-45 declares that sites will configure each IDA's (independently-deployable
application's) Django settings with an ``<APPNAME>_CFG`` YAML file, parsed and
loaded by a single upstream-provided ``DJANGO_SETTINGS_MODULE``. This contrasts
with the Django convention, which is that sites override Django settings using
their own ``DJANGO_SETTINGS_MODULE``. The rationale was that all Open edX
setting customization can be reasonably specified in YAML; therefore, it is
operationally safer to avoid using a custom ``DJANGO_SETTINGS_MODULE``, and it
is operationally desirable for all operation modes to execute the same Python
module for configuration. This was `briefly discussed in the OEP-45 review
<https://github.com/openedx/open-edx-proposals/pull/143#discussion_r411180111>`_.

For example, in theory, the upstream production LMS config might be named
``lms/settings/settings.py`` and work like this:

* import ``lms/settings/required.py``, which declares settings that must be
overridden.
* import ``lms/settings/defaults.py``, which defines reasonable defaults for
all other settings.
* load ``/openedx/config/lms.yml``, which should override every setting
declared in ``required.py`` and override some settings defined in
``defaults.py``.
* apply some minimal merging and/or conditional logic to handle YAML values
which are not simple overrides (e.g., ``FEATURES``, which needs to be
merged).

The upstream production CMS config would exist in parallel.
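
As a rough illustration of the flow described above, the following sketch
models the four steps with plain dictionaries. It is not edx-platform code: the
names ``REQUIRED``, ``DEFAULTS``, ``MERGED_KEYS``, ``build_settings`` and the
``ENABLE_FOO``/``ENABLE_BAR`` flags are hypothetical, and the parsed YAML is
represented as an already-loaded ``dict`` so that the example needs no
third-party YAML parser.

```python
# Hypothetical sketch of the theoretical settings flow; not edx-platform code.

# Stand-in for lms/settings/required.py: settings that must be overridden.
REQUIRED = {"SECRET_KEY", "LMS_BASE"}

# Stand-in for lms/settings/defaults.py: reasonable defaults for everything else.
DEFAULTS = {
    "TIME_ZONE": "UTC",
    "FEATURES": {"ENABLE_FOO": False, "ENABLE_BAR": True},
}

# Dictionary settings that are merged key-by-key instead of being replaced.
MERGED_KEYS = {"FEATURES"}


def build_settings(yaml_values):
    """Combine the defaults with values parsed from the YAML file."""
    missing = REQUIRED - set(yaml_values)
    if missing:
        raise ValueError(f"Missing required settings: {sorted(missing)}")

    settings = dict(DEFAULTS)
    for name, value in yaml_values.items():
        if name in MERGED_KEYS:
            # Merge: keep the default entries, then apply the YAML entries on top.
            settings[name] = {**DEFAULTS.get(name, {}), **value}
        else:
            # Simple override.
            settings[name] = value
    return settings


# Stand-in for the parsed contents of /openedx/config/lms.yml.
lms_yaml = {
    "SECRET_KEY": "not-a-real-secret",
    "LMS_BASE": "lms.example.com",
    "FEATURES": {"ENABLE_FOO": True},
}
settings = build_settings(lms_yaml)
```

In this sketch, ``FEATURES`` keeps its default ``ENABLE_BAR`` entry while
``ENABLE_FOO`` is overridden, and a YAML file that omits a required setting
fails loudly instead of silently falling back to a default.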

However, as of Sumac, we do not know of any site other than edx.org that
successfully uses only YAML files for configuration. Furthermore, the
upstream-provided ``DJANGO_SETTINGS_MODULE`` which loads these YAML files
(``lms/envs/production.py``) is not simple: it declares defaults, imports from
other Django settings modules, sets more defaults, handles dozens of special
cases, and has a special Open-edX-specific "derived settings" mechanism to
handle settings that depend on other settings.

Tutor does provide YAML files, but *it also has custom production and
development settings files*! The result is that we have multiple layers of
indirection between edx-platform's common base settings, and the Django
settings rendered into the actual community-supported Open edX distribution
(Tutor). Specifically, production edx-platform configuration currently works
like this:

* ``lms/envs/tutor/production.py``...

* is generated by Tutor from the template
``tutor/templates/apps/openedx/settings/lms/production.py``,

* which derives
``tutor/templates/apps/openedx/settings/partials/common_lms.py``,

* which derives
``tutor/templates/apps/openedx/settings/partials/common_all.py``;

* and uses template variables from Tutor configuration (``config.yml``),

* and invokes hooks from any enabled Tutor plugins;

* it imports ``lms/envs/production.py``,

* which imports ``lms/envs/common.py``,

* which sets production-inappropriate defaults;

* it sets more defaults, some of them edX.org-specific;

* it loads ``/openedx/config/lms.yml``...

* which is generated by Tutor from template
``tutor/templates/apps/openedx/config/lms.env.yml``

* which derives
``tutor/templates/apps/openedx/config/partials/auth.yml``;

* it reverts some of ``lms.yml`` with new "defaults";

* and it uses certain values from ``/openedx/config/lms.yml`` to
conditionally override more settings and update certain dictionary
settings, in a way which is not documented.

* ``cms/envs/tutor/production.py``...

* is generated by Tutor from the template
``tutor/templates/apps/openedx/settings/cms/production.py``,

* which derives
``tutor/templates/apps/openedx/settings/partials/common_cms.py``,

* which derives
``tutor/templates/apps/openedx/settings/partials/common_all.py``;

* and uses template variables from Tutor configuration (``config.yml``),

* and invokes hooks from any enabled Tutor plugins;

* it imports ``cms/envs/production.py``,

* it imports ``cms/envs/common.py``, which sets production-inappropriate
defaults,

* and which imports ``lms/envs/common.py``, which also sets
production-inappropriate defaults;

* it sets more defaults, some of them edX.org-specific;

* it loads ``/openedx/config/cms.yml``...

* which is generated by Tutor from template
``tutor/templates/apps/openedx/config/cms.env.yml``

* which derives
``tutor/templates/apps/openedx/config/partials/auth.yml``;

* it reverts some of ``/openedx/config/cms.yml`` with new "defaults";

* and it uses certain values from ``/openedx/config/cms.yml`` to
conditionally override more settings and update certain dictionary
settings, in a way which is not documented.

This is very difficult to reason about. Configuration complexity is frequently
cited as a chief area of pain for Open edX developers and operators.
Discussions in the Named Release Planning and Build-Test-Release Working Groups
are frequently encumbered by confusion and uncertainty about what the default
settings are in edx-platform, how they differ from Tutor's default settings,
what settings can be overridden, and how to do so. Only a minority of developers
and operators fully understand the configuration logic described above
end-to-end; even for those who do, following this override chain for any given
Django setting is time-consuming and error-prone. CAT-1 bugs and high-severity
security vulnerabilities have arisen due to misunderstanding of how
edx-platform Django settings are rendered.
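
To make the difficulty concrete, here is a toy model of an override chain. It
only assumes that each layer is applied in order and may overwrite earlier
values; the real chain also reverts values and applies conditional logic, which
this sketch does not capture. The function ``apply_layers`` and the
``EXAMPLE_*`` setting names are hypothetical, and the layer contents are
invented for illustration.

```python
# Toy model of layered settings; setting names and values are illustrative only.


def apply_layers(layers):
    """Apply (layer_name, overrides) pairs in order and record provenance."""
    values = {}
    provenance = {}
    for layer_name, overrides in layers:
        for setting, value in overrides.items():
            values[setting] = value
            # Remember every layer that touched this setting, in order.
            provenance.setdefault(setting, []).append(layer_name)
    return values, provenance


layers = [
    ("lms/envs/common.py", {"EXAMPLE_TIMEOUT": 30, "EXAMPLE_FLAG": False}),
    ("lms/envs/production.py", {"EXAMPLE_TIMEOUT": 60}),
    ("/openedx/config/lms.yml", {"EXAMPLE_FLAG": True}),
    ("lms/envs/tutor/production.py", {"EXAMPLE_TIMEOUT": 10}),
]
values, provenance = apply_layers(layers)
```

Even in this simplified form, finding the final value of one setting means
reading every layer. Here ``provenance["EXAMPLE_TIMEOUT"]`` lists three
different files, and only the last one determines the result.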

Developers are frequently instructed that if they need to override a Django
setting, the preferred way to do so is to "make a Tutor plugin". This demands a
large amount of prior knowledge, boilerplate, and indirection, all to do
something which Django provides out-of-the-box via a custom
``DJANGO_SETTINGS_MODULE``.
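
For comparison, the Django convention can be sketched without Django itself.
The example below writes two throwaway modules, selects one through the
``DJANGO_SETTINGS_MODULE`` environment variable, and collects its upper-case
names, which is the subset of a settings module that Django reads. The module
names ``upstream_settings`` and ``my_site_settings``, the helper
``load_settings``, and the setting ``EXAMPLE_SITE_NAME`` are assumptions made
for this illustration only.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

# Write two throwaway modules: an "upstream" module and a site-specific override.
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "upstream_settings.py").write_text(
    "EXAMPLE_SITE_NAME = 'Default Site'\n"
    "TIME_ZONE = 'UTC'\n"
)
(tmp_dir / "my_site_settings.py").write_text(
    "from upstream_settings import *  # start from the upstream values\n"
    "EXAMPLE_SITE_NAME = 'Example University'  # override a single setting\n"
)
sys.path.insert(0, str(tmp_dir))

# Operators select the settings module through this environment variable.
os.environ["DJANGO_SETTINGS_MODULE"] = "my_site_settings"


def load_settings():
    """Import the selected module and collect its upper-case names."""
    module = importlib.import_module(os.environ["DJANGO_SETTINGS_MODULE"])
    return {name: getattr(module, name) for name in dir(module) if name.isupper()}


loaded = load_settings()
```

The override is two lines of ordinary Python in ``my_site_settings.py``, with
no plugin machinery involved.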

Finally, it is worth noting that all of this complexity and toil exists alongside
other edx-platform configuration methods, such as Waffle, configuration models,
site configuration, XBlock configuration, and entry points. Those configuration
pathways are outside of the scope of this ADR, but are mentioned to demonstrate
the distressing level of complexity that developers and operators face when
working with the platform.

Decision & Consequences
***********************

Overview
========

We orient edx-platform towards using standard Django settings configuration
patterns. Specifically, we will make it easy for operators to override settings
by supplying a custom ``DJANGO_SETTINGS_MODULE``.

Moving towards this goal will need to be an iterative and careful process,
and it's likely that some aspects of the target structure or plan (described
below) will need to be updated along the way. Nonetheless, once it becomes clear
that we are landing on a solid settings structure for edx-platform, we'll
propose an OEP-45 update to generalize the structure to all deployable Open edX
Django applications.

Finally, based on what we learn throughout this process, our OEP-45 proposal
will recommend that we either:

1. Drop support for the ``<APPNAME>_CFG`` YAML files, or

2. Simplify the ``<APPNAME>_CFG`` YAML schema, document it, and clarify that it
is an optional alternative to ``DJANGO_SETTINGS_MODULE`` rather than the
required/preferred configuration method.

Target settings structure for edx-platform
==========================================

* ``openedx/envs/common.py``: Define as much shared configuration between LMS
and CMS as possible, including: (a) where possible, annotated definitions of
edx-platform-specific settings with *reasonable, production-ready* defaults;
(b) otherwise, annotated definitions of edx-platform-specific settings (like
secrets) with *obviously-wrong* defaults, ensuring they aren't used in
production; and (c) reasonable production-ready overrides of third-party
settings, ideally with explanatory comments (but not annotations). When a
particular setting's default should depend on the *final* value of another
setting, the former should be assigned to a
``Derived(...)`` value, where ``...`` is a computation based on the latter.

* ``lms/envs/common.py``: Extend ``openedx/envs/common.py`` to create, as
much as possible, a production-ready settings file for the LMS. These
extensions may include: (a) annotated definitions of LMS-specific settings
with production-ready defaults; (b) annotated definitions of LMS-specific
settings with obviously-wrong defaults; and (c) LMS-specific
overrides of settings defined in ``openedx/envs/common.py`` and of
third-party settings, ideally with explanatory comments (but not
annotations). Again, ``Derived`` settings can be used as appropriate. This
will be the default settings file for running LMS management commands,
although tools can override this (as usual) by specifying a
``DJANGO_SETTINGS_MODULE``.

* ``lms/envs/test.py``: Override LMS settings for unit tests. Should work
in a local venv as well as in CI. Needs to invoke ``derive_settings`` in
order to render all previously-defined ``Derived`` settings.

* ``<third_party_repo>/lms_prod.py`` (example path): In order to
deploy the LMS, third-party providers (like edx.org) and tools (like
Tutor) will need to separately maintain their own custom settings module
derived from ``lms/envs/common.py``, and point their
``DJANGO_SETTINGS_MODULE`` environment variable at this module. It is
important that this module both (i) replaces the obviously-wrong settings
with appropriate production settings, and (ii) invokes
``derive_settings`` to render all previously-defined ``Derived`` settings.

Contributor

I'm not really clear on the consequences of this for us, and what this will entail. Not sure if it is worth adding some notes under a consequences section.

Member Author

I have attempted to explain the consequences in this bullet point. Can you be more specific about what's unclear?

Contributor

Thanks @kylecrawshaw. A couple of thoughts:

  1. First, I realized that the term third-party was confusing to me. In the context of the Open edX platform, I'm used to the term third-party being applied to libraries (e.g. Django, DRF) and tools. Even the term third-party provider, I would typically associate with examples like: AWS, Azure, Confluent, etc., and not with Open edX providers, unless specifically designated.
     a. Below you used the phrase "third-party providers (like edx.org)", and that was clear to me, but only because the example was edx.org, and it took me a moment to re-orient myself. I'm wondering if switching this to "Open edX providers (like edx.org)", or something like that, would be more clear?
     b. The reference <third_party_repo> is again ambiguous (on its own, and it is opening this paragraph), and maybe this could be <open_edx_provider_repo>? I also realized that you wanted a term that would cover tools like Tutor. Maybe <open_edx_provider_or_tool_repo>?
     c. Related, there are a bunch of references to third-party settings in this ADR, and now I'm not clear which are third-party library settings, and which are Open edX provider settings? Maybe you could search for third-party through the ADR and double-check whether being slightly more explicit will bring clarity?
  2. This step is written as if it is required, but the consequences section better details the possible alternative of using lms/envs/yaml.py. Maybe that was part of my original confusion? Is there some way to make this clear, if I am understanding correctly?

* ``lms/envs/yaml.py`` (only if we decide to retain YAML support):
An upstream-maintained alternative to
``<third_party_repo>/lms_prod.py``. Loads overrides from a YAML file at
``LMS_CFG``, plus some well-defined special handling for mergeable values
like ``FEATURES``. This is adapted from and replaces
``lms/envs/production.py``. It will invoke ``derive_settings``.

* ``lms/envs/dev.py``: Override LMS settings so that it can run
"bare metal" directly on a developer's local machine using debug-friendly
settings. Will use ``local.openedx.io`` (which resolves to 127.0.0.1) as
a base domain, which should be suitable for third-party tools as well. It
will invoke ``derive_settings``.

* ``<third_party_repo>/lms_dev.py`` (example path): In order to
run the LMS, third-party tools (like Tutor, and 2U's devstack) will
need to separately maintain their own custom settings module derived
from ``lms/envs/dev.py``, and point their
``DJANGO_SETTINGS_MODULE`` environment variable at this module.

* ``cms/envs/common.py``

* ``cms/envs/test.py``

* ``<third_party_repo>/cms_prod.py`` (example path)

* ``cms/envs/yaml.py`` (only if we decide to retain YAML support)

* ``cms/envs/dev.py``

* ``<third_party_repo>/cms_dev.py`` (example path)
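
The interaction between obviously-wrong defaults, provider overrides, and
``Derived`` settings can be sketched as follows. The ``Derived`` class and
``derive_settings`` function below are simplified, hypothetical stand-ins
written for this example; the real helpers in edx-platform may have different
signatures and behavior. The ``EXAMPLE_*`` settings are likewise invented, and
``types.SimpleNamespace`` objects stand in for settings modules.

```python
import types

# Simplified, hypothetical stand-ins for the Derived / derive_settings helpers
# mentioned above; the real edx-platform API may differ.


class Derived:
    """Marks a setting whose value is computed from the final settings."""

    def __init__(self, compute):
        self.compute = compute


def derive_settings(settings):
    """Replace every Derived placeholder with its computed value, in place."""
    for name, value in list(vars(settings).items()):
        if isinstance(value, Derived):
            setattr(settings, name, value.compute(settings))


# Stand-in for lms/envs/common.py.
common = types.SimpleNamespace(
    # Obviously-wrong default: providers are expected to replace it.
    EXAMPLE_SECRET="obviously-wrong-change-me",
    EXAMPLE_BASE_DOMAIN="localhost",
    # Depends on the *final* value of EXAMPLE_BASE_DOMAIN.
    EXAMPLE_ROOT_URL=Derived(lambda s: f"https://{s.EXAMPLE_BASE_DOMAIN}"),
)

# Stand-in for <third_party_repo>/lms_prod.py: copy, override, then derive.
prod = types.SimpleNamespace(**vars(common))
prod.EXAMPLE_SECRET = "value-loaded-from-a-secret-store"
prod.EXAMPLE_BASE_DOMAIN = "lms.example.com"
derive_settings(prod)
```

Because derivation runs last, ``prod.EXAMPLE_ROOT_URL`` reflects the provider's
domain rather than the upstream default. This is why the terminal settings
module, and not ``common.py``, is the one that needs to invoke
``derive_settings``. The sketch does not handle one ``Derived`` value depending
on another.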

Plan of action
==============

These steps are non-breaking unless noted.

* Introduce a dump_settings management command so that we can more easily
validate changes (or lack thereof) to the terminal edx-platform settings
modules.

* Improve edx-platform's API for
deriving settings, as we are about to depend on it significantly more than we
currently do. This is a potentially BREAKING CHANGE to any third-party
settings files which imported from ``openedx.core.lib.derived``.

* Remove redundant overrides in (cms,lms)/envs/production.py. Use Derived
settings defaults to further simplify the module without changing its output.

* Create openedx/envs/common.py, ensuring that any annotations defined in it
are included in the edx-platform reference docs build. Move settings which
are shared between (cms,lms)/envs/common.py into openedx/envs/common.py. This
may be iteratively done across multiple PRs.

* Find the best production-ready defaults between both
(lms,cms)/envs/production.py and Tutor's production.py files, and "bubble" them
up to (openedx,cms,lms)/envs/common.py. Keep (lms,cms)/envs/production.py unchanged
through this process. This is a BREAKING CHANGE for any operator that derives
from (lms,cms)/envs/common.py directly. Most operators derive from
(lms,cms)/envs/production.py, so we do not expect this to affect many sites,
if any.

* Develop (cms,lms)/envs/dev.py based off of (cms,lms)/envs/common.py.
Iterate until we can run a "bare metal" development server for LMS and CMS
using these settings.

* Deprecate and remove (cms,lms)/envs/devstack.py. This is a BREAKING CHANGE to
third-party development tools (like Tutor and 2U's devstack), as they will
now either need to maintain local copies of these modules, or "rebase"
themselves onto (lms,cms)/envs/dev.py.

* Propose and, if accepted, implement an update to OEP-45 (Configuring and
Operating Open edX). `Progress on this update is tracked here`_. As mentioned
in the Decision section, this update will either:

1. Revoke the OEP-45 sections regarding YAML. Deprecate and remove
(cms,lms)/envs/production.py. This is a BREAKING CHANGE for tools and
providers that use these settings modules, as they will either need to
maintain local copies of these modules, or "rebase" their internal
settings modules onto (cms,lms)/envs/common.py. Update operator
documentation as needed.

2. Update OEP-45 to clarify that YAML configuration is
optional. Operators can opt out of YAML by deriving directly from
(cms,lms)/envs/common.py, or they can opt into YAML by using
(cms,lms)/envs/yaml.py. Document a simplified YAML schema in OEP-45.
There will be several well-communicated BREAKING CHANGES in YAML behavior
in order to achieve the simplified schema. Furthermore, the rename of
(cms,lms)/envs/production.py to (cms,lms)/envs/yaml.py will be a BREAKING
CHANGE.
Comment on lines +291 to +309
Contributor


  1. How will this sub-decision get made, and will this ADR be updated to reflect that commitment?
  2. Could this be made less ambiguous at this time? It seemed to me that the current thought is to support a yaml.py, with the caveat that we might learn something along the way that convinces us to deprecate this and push it to Open edX providers. If this is accurate, could we document it? If not, maybe you add an Out of Scope section to this ADR, and detail that it is unknown whether yaml.py will be supported in the platform, or deprecated in favor of Open edX providers adding in this functionality to their own settings files as needed.

Member Author


> 1. How will this sub-decision get made, and will this ADR be updated to reflect that commitment?

The ADR currently says:

> Finally, based on what we learn throughout this process, our OEP-45 proposal will either recommend to:
>
>   1. ...
>   2. ...

which I felt was enough commitment and detail for an ADR. As with any OEP update, there will be an opportunity to debate the proposed decisions.

> Could this be made less ambiguous at this time? It seemed to me that the current thought is to support a yaml.py, with the caveat that we might learn something along the way that convinces us to deprecate this and push it to Open edX providers. If this is accurate, could we document it? If not, maybe you add an Out of Scope section to this ADR, and detail that it is unknown whether yaml.py will be supported in the platform, or deprecated in favor of Open edX providers adding in this functionality to their own settings files as needed.

I think this is an area where you, I, and @feanil all feel differently. Personally, I have felt, and still feel, that production.py is so complex and bespoke that deprecating it and pushing it to Open edX providers will actually be easier for all involved parties than performing (and reacting to) the surgical breaking changes that'd be necessary to get it to a point of having a simple, well-documented, and generally-useful "yaml.py". But, I could be wrong, and I don't think we will be able to reach consensus on this until we have already gotten our hands dirty coding. Hence, the ambiguity.

Contributor


Thanks @kylecrawshaw. I too obviously don’t know how this will play out. I’m still left wondering if there is a better way to document all this, including the context in these comments?

One idea is to capture this as its own issue, and have the ADR refer to the issue. This would have two benefits. First, it would push the lack of decision outside of an ADR, which is meant to capture decisions. Second, if you want to capture more context or notes on the subject, it gives a home for that discussion. And it feels more natural to update that issue with more information as it is learned, than this decision record.

That said, if you prefer the current take and don’t like this proposal, I don’t think it is worth discussing further. So, please treat this as a completely non-blocking proposal.


* Create tickets to achieve a similar OEP-45-compliant settings structure in
any IDAs (independently-deployable applications) which exist in the openedx
GitHub organization, such as the Credentials service.

.. _Progress on this update is tracked here: https://github.com/openedx/open-edx-proposals/issues/587
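
The first step of the plan relies on being able to compare the rendered
settings before and after a change. A minimal sketch of that idea follows. It
is not the actual ``dump_settings`` management command: the function names,
the JSON output format, and the ``EXAMPLE_*`` settings are assumptions made for
illustration, and ``types.SimpleNamespace`` objects stand in for settings
modules.

```python
import json
import types


def dump_settings(settings_module):
    """Return a sorted JSON dump of a module's upper-case settings."""
    dumped = {}
    for name in dir(settings_module):
        if not name.isupper():
            continue
        value = getattr(settings_module, name)
        try:
            json.dumps(value)
        except TypeError:
            # Fall back to repr() for values that JSON cannot represent.
            value = repr(value)
        dumped[name] = value
    return json.dumps(dumped, indent=2, sort_keys=True)


def changed_settings(before_json, after_json):
    """List the setting names whose dumped values differ between two dumps."""
    before, after = json.loads(before_json), json.loads(after_json)
    names = before.keys() | after.keys()
    return sorted(name for name in names if before.get(name) != after.get(name))


before = types.SimpleNamespace(EXAMPLE_TIMEOUT=30, EXAMPLE_HOSTS=["a.example.com"])
after = types.SimpleNamespace(EXAMPLE_TIMEOUT=60, EXAMPLE_HOSTS=["a.example.com"])
diff = changed_settings(dump_settings(before), dump_settings(after))
```

A refactoring step that is meant to be non-breaking would be expected to
produce an empty ``diff`` for each terminal settings module.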

Alternatives Considered
***********************

One alternative settings structure
==================================


Here is an alternate structure that would de-dupe any shared LMS/CMS dev & test
logic by creating more shared modules within the openedx/envs folder. Although
DRYer, this structure would increase the total number of edx-platform files and
potentially encourage more LMS-CMS coupling. So, we will not pursue this
structure, but will keep it in mind as an alternative if we encounter
difficulties with the plan laid out in this ADR.

* ``openedx/envs/common.py``

* ``lms/envs/prod.py``

* ``$THIRD_PARTY/lms/production.py``

* ``cms/envs/prod.py``

* ``$THIRD_PARTY/cms/production.py``

* ``openedx/envs/test.py``

* ``lms/envs/test.py``

* ``cms/envs/test.py``

* ``openedx/envs/dev.py``

* ``lms/envs/dev.py``

* ``$THIRD_PARTY/lms/dev.py``

* ``cms/envs/dev.py``

* ``$THIRD_PARTY/cms/dev.py``