Google Summer of Code 2024
PySAL is inviting students to join in PySAL's development by applying for Google Summer of Code 2024. This is the eighth year PySAL will be seeking to participate.
PySAL is an open source library of spatial analysis functions written in Python, intended to support the development of high-level applications. See our documentation for more details. The developer guide describes in more detail how to make contributions to PySAL and our workflow for contributing to the project. Our issues are also on GitHub; they include bug reports, 'wishlist' items, and enhancement plans and ideas.
If you are interested in participating in GSoC as a student, the best approach is to become an active and engaged contributor to the project right away. You should take a look at some of the existing issues on GitHub and see if there are any you think you might be able to take a crack at. Try submitting a pull request for something and start getting the hang of the process and interacting with the PySAL code base and development community. It is a good idea to start on your proposal early, post a draft to the pysal chat room and iterate based on the feedback you receive. This will not only improve the quality of your proposal, but also help you find a suitable mentor.
Below is a listing of possible projects that students might consider. We also encourage students to propose their own projects, though several of the following topics are relatively high on our priority list. Our priority list is flexible, and it is important that the topic matches the interest and background of the student.
When considering the following projects, don't be put off by the knowledge prerequisites -- you don't need to be an expert, and there is some scope for research and learning within the GSoC period. However, familiarity with and interest in the subject area and involved technologies will be helpful!
Exploratory Spatial Data Analysis (ESDA) is one of the most important steps in understanding spatial data. A range of statistics has been proposed over the years, with more on the horizon. However, there remains a need to implement these statistics in contemporary programming languages. PySAL is currently looking for a GSoC student who would be interested in implementing statistics like:
- multinomial join counts (i.e. multiclass/color) (intermediate)
- CAGE statistic criterion for aggregation error (paper) (hard)
- local gamma statistic (easy)
- Local Spatial Dispersion (LSD) (paper) (intermediate)
- Rathelot's Index (paper) (easy)
- Distance decay/variogram estimators (intermediate)
- Cleaning up, finishing permutation inference, and testing Lee's local spatial Pearson statistic and local aspatial Pearson (easy)
- Local Tau migration from giddy (easy)
- Local neighbour match test (easy)
- Expanding reporting capabilities of existing statistics in the submodule (e.g. Saddlepoint approximation for local Moran, bootstrap inference for LOSH (paper), and other methods implemented, for example, in spdep) (easy)
- Alternative p-value computation methods for simulated statistics (easy)
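To give a flavour of what several of these items involve, here is a minimal sketch of permutation inference for a simulated statistic, using a binary "black-black" join count on a small lattice. It is not PySAL's implementation: the helper names (`rook_neighbors`, `bb_join_count`, `pseudo_p_value`), the toy data, and the seed are all assumptions made for illustration, and only the standard library is used.

```python
import random


def rook_neighbors(nrows, ncols):
    """Rook-contiguity neighbours for a regular lattice, keyed by cell index."""
    neighbors = {}
    for r in range(nrows):
        for c in range(ncols):
            cell = r * ncols + c
            adjacent = []
            if r > 0:
                adjacent.append(cell - ncols)
            if r < nrows - 1:
                adjacent.append(cell + ncols)
            if c > 0:
                adjacent.append(cell - 1)
            if c < ncols - 1:
                adjacent.append(cell + 1)
            neighbors[cell] = adjacent
    return neighbors


def bb_join_count(values, neighbors):
    """Count joins whose two endpoints both carry the value 1."""
    return sum(
        values[i] * values[j]
        for i, adjacent in neighbors.items()
        for j in adjacent
        if i < j  # visit each undirected join once
    )


def pseudo_p_value(values, neighbors, permutations=999, seed=12345):
    """One-sided pseudo p-value from random relabelling of the observations."""
    rng = random.Random(seed)
    observed = bb_join_count(values, neighbors)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if bb_join_count(shuffled, neighbors) >= observed:
            at_least_as_large += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (at_least_as_large + 1) / (permutations + 1)


# Hypothetical 3x3 binary map with the 1s clustered in the top-left corner.
values = [1, 1, 0,
          1, 1, 0,
          0, 0, 0]
observed, p_sim = pseudo_p_value(values, rook_neighbors(3, 3))
print(observed, p_sim)
```

On this toy map the observed count is 4 joins; the pseudo p-value depends on the random draws. A project on alternative p-value computation would explore replacements for the final `(count + 1) / (permutations + 1)` step, for example two-sided variants.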
Dependent on the statistic, anywhere from easy to hard. Consult the difficulty ratings on the statistics themselves.
User-class implementations of a set of the above statistics
- knowledge of Python (required)
- introductory statistics (required)
- spatial statistics experience (preferred)
- Levi Wolf (@ljwolf)
- Serge Rey (@sjsrey)
Depends on the number and kind of estimators chosen: 2 estimators will take ~175 hours, 4+ will take ~350 hours.
Facility location modeling is critically important in public- and private-sector application, planning, and decision-making contexts. Currently, most facility location modeling is implemented in commercial software such as the Location-Allocation tool in ArcGIS Network Analyst. Some open-source tools on this topic have been developed in both Python and R, such as PySpatialOpt and Maxcovr. However, a comprehensive open-source facility location modeling toolset is still needed to lower the barrier to implementing facility location models in the future.
We had a very successful GSoC project for the spopt package last year, in which we developed 4 basic models: LSCP, MCLP, P-median, and P-centre. This year PySAL is looking for a GSoC student who would be interested in further improving the spopt package by implementing models like:
- Backup Coverage Location Problem
- p-Dispersion (dispersion model) under different neighborhood restrictions and max-min-min dispersion
- implement model components such as facility capacity, demand unit shape, distance metric, and solution approach by incorporating open-source GIS libraries (NetworkX, GeoPandas, Shapely)
- see examples in example1, example2, and example3
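As a rough illustration of the p-dispersion idea listed above (choose p facility sites so that the smallest distance between any two chosen sites is as large as possible), the sketch below enumerates every subset exhaustively. This is only practical for tiny instances and is not how a solver-based package formulates the problem; the function name `p_dispersion` and the candidate coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def p_dispersion(sites, p):
    """Return the p site indices that maximise the minimum pairwise distance."""
    if p < 2 or p > len(sites):
        raise ValueError("p must be between 2 and the number of candidate sites")
    best_subset, best_spacing = None, -1.0
    for subset in combinations(range(len(sites)), p):
        # Smallest Euclidean distance between any two sites in this subset.
        spacing = min(dist(sites[i], sites[j]) for i, j in combinations(subset, 2))
        if spacing > best_spacing:
            best_subset, best_spacing = subset, spacing
    return best_subset, best_spacing


# Hypothetical candidate sites as (x, y) coordinates.
sites = [(0, 0), (1, 0), (0, 1), (4, 0), (0, 3), (4, 3)]
chosen, spacing = p_dispersion(sites, 3)
print(chosen, spacing)
```

For these six points the chosen indices are (1, 4, 5), with a minimum spacing of the square root of 10. A production implementation would instead express the model as an integer program and hand it to an optimization solver, which is where the distance metric and neighborhood restrictions mentioned above would enter.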
- Open-Source Approaches for Location Coverage Modelling
- Business site selection, location analysis, and GIS
- Location covering models: history, applications and advancements
- interest in facility location modeling and spatial optimization
- knowledge and experience with facility location modeling theories and optimization solvers are preferred
Difficulty: intermediate
Project size: 350 hours
Ready-to-use implementations of a subset of optimization models above
- knowledge of Python (required)
- Linear algebra (preferred)
- Linear programming (preferred)
- Spatial Optimization (preferred)
PySAL is an open source project, and as such we invite contributions from any interested developer. If you have an idea for an enhancement for PySAL, please contact one of the developers to discuss the possibilities for the project in GSoC 2024.