Evergreen: Exploration Data Science
Evergreen: Generate new knowledge by predicting all Australian mineral deposits
This Evergreen challenge is designed to help data scientists get familiar with datasets used for exploring for ore deposits and prepare for the ExploreSA challenge, using Australia-wide data.
This challenge was originally launched as part of the OZ Minerals Explorer Challenge in 2019.
This challenge provides far more training points for your model than the ExploreSA challenge; however, the data is sparser.
You are welcome and encouraged to use your findings and work from this challenge for the ExploreSA challenge.
A leaderboard is provided for benchmarking and feedback purposes.
The challenge is designed to help data science teams master geological uncertainty and predictive accuracy while working on a difficult real-world problem!
A plethora of geoscientific data has been collected across Australia, and is available through public repositories, including the National Computational Infrastructure (NCI), the state surveys, and Geoscience Australia.
In this challenge, you are invited to an open season on this data. We have selected more than 3,000 identified deposit locations from Geoscience Australia, with data layers of key geoscientific information, for a variety of deposit types and styles.
Mineralisation processes, such as mountain building, earthquake activity and groundwater flow, are a tightly coupled web of causes and effects spanning from the nanoscale to the radius of the earth, over timescales from microseconds to billions of years. This complexity means that data markers and hypotheses are rarely simple or quickly proven, and they require the analysis of many layers of geoscientific information by geology teams.
Using data science to analyse deposit locations and data layers, can you build a higher level model to predict mineralisation?
You can develop your model using any approach you like, with any data you like, and can use our leaderboard function to get feedback on your algorithm as you go.
To win, you will need to demonstrate the technical capabilities of your model (e.g. Jupyter notebook, visualisation demo), and convince us of its usefulness. No points are allocated to leaderboard position - this is provided for your feedback only.
We've taken 'postage stamps' of 25 x 25 km areas of Australia, and extracted geology, geophysics and ASTER coverages for these areas. Some of the areas have one or more mineral deposits (up to 42 deposits in one case near Kalgoorlie, Western Australia), while some do not.
Each dataset has layers organised under a data ID - an 8-digit identifier.
The stamps have all been regridded and reprojected to a local oblique Mercator grid, so that the x and y coordinates are in metres from the centre of the grid. You could undo this fairly easily with a bit of image matching or database lookup, but it's really not worth it: we're not using the scoreboard as part of the judging, only as benchmarking and feedback for your models. You'll need to convince us that you haven't overfit to the leaderboard as part of your submission! The stamps have been generated using the code in the Unearthed explore_australia repository.
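As a rough illustration of what "metres from the centre of the grid" means in practice, the sketch below builds pixel-centre coordinates for one axis of a 25 km stamp and maps a local (x, y) position to a pixel index. The function names, the pixel count (n_pixels) and the row orientation (row 0 at the northern edge) are assumptions made for this example; they are not taken from the explore_australia repository, so check them against the actual stamp files before relying on them.

```python
STAMP_SIZE_M = 25_000.0  # 25 km stamp edge, from the challenge description


def stamp_axes(n_pixels, size_m=STAMP_SIZE_M):
    """Return pixel-centre coordinates (metres) along one axis of a stamp.

    The origin is the centre of the stamp, so values run from just above
    -size_m / 2 to just below +size_m / 2.
    """
    pixel = size_m / n_pixels
    return [(i + 0.5) * pixel - size_m / 2.0 for i in range(n_pixels)]


def local_to_pixel(x, y, n_pixels, size_m=STAMP_SIZE_M):
    """Map local (x, y) in metres to (row, col), or None if outside the stamp.

    Assumption: row 0 is the northern edge, so row increases as y decreases.
    """
    half = size_m / 2.0
    if not (-half <= x < half and -half < y <= half):
        return None
    pixel = size_m / n_pixels
    col = int((x + half) // pixel)
    row = int((half - y) // pixel)
    return row, col


if __name__ == "__main__":
    # Hypothetical 100 x 100 pixel stamp, i.e. 250 m pixels.
    axes = stamp_axes(100)
    print(axes[0], axes[-1])
    print(local_to_pixel(0.0, 0.0, 100))
```

A helper like this is mainly useful for overlaying deposit positions on a stamp once both are expressed in the same local coordinates.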
We've created a kick-starter repo of Python functions to pull coverage data from the National Computational Infrastructure’s data service endpoints so you can get useful subsets of data from continent-wide coverages of geology, geophysics and remote sensing. You’re not restricted to using this data - any other public data is fair game (but you will need to tell us what data you end up using in your final models).
Fork the repo on GitHub here.
In addition to the raster coverages, Geoscience Australia also provides digital geological maps as shapefiles, which you can download from data.gov.au. There are starter functions for reading and munging this data in the repository.
Make sure you also take a look at the data portals of the other state and federal geological surveys for tons of useful data, including regolith, downhole and hydrogeochemistry, and other geophysical surveys. For starters, try:
- Geological Survey of South Australia
- Geoscience Australia Data and Publications
- Geological Survey of Western Australia
- National map
As targets, we've provided 1,863 deposit locations gleaned from Geoscience Australia's Identified Mineral Resources database. We’ve done some cleaning and simplifying of these target locations; for more information, see the readme here.
Download the targets:
- As GeoJSON: https://github.com/unearthed/explore_australia/blob/master/data/deposit_locations.geo.json
- As CSV: https://github.com/unearthed/explore_australia/blob/master/data/deposit_locations.csv
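Once the GeoJSON file is downloaded, the deposit coordinates can be pulled out with the Python standard library alone. The sketch below is a minimal example that assumes the file is a GeoJSON FeatureCollection of Point features; that layout has not been checked against the actual file, and the function name and the sample data are made up for illustration. GeoJSON stores coordinates as longitude first, then latitude.

```python
import json


def deposit_points(geojson_text):
    """Return a list of (lon, lat) tuples for Point features in a GeoJSON string."""
    collection = json.loads(geojson_text)
    points = []
    for feature in collection.get("features", []):
        # A feature may legally have a null geometry; skip those.
        geometry = feature.get("geometry") or {}
        if geometry.get("type") == "Point":
            lon, lat = geometry["coordinates"][:2]
            points.append((lon, lat))
    return points


# Tiny made-up sample for illustration only (not real deposit data).
SAMPLE = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {},
   "geometry": {"type": "Point", "coordinates": [121.5, -30.75]}},
  {"type": "Feature", "properties": {}, "geometry": null}
]}
"""

if __name__ == "__main__":
    print(deposit_points(SAMPLE))
```

The links above open GitHub's HTML view of each file, so use the "Raw" button on that page (or clone the repository) to get the file contents, then pass the text to the function.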
To make it easy, we have created a jump-starter repo to quickly access the data - so that you can get straight into building your models and creating new insights.
Fork the repo on GitHub here.
Any and all public data can be used for this challenge, but you will need to tell us what data you end up using in your final models.
A public leaderboard will be provided for your own live feedback and benchmarking purposes.
You can submit models as many times as you like.
Because discoveries are tied to investment, they tend to be announced fairly quickly, so there is no secret dataset that could serve as a hidden test set to prevent cheating. We hope the leaderboard still helps you to benchmark your model and track your progress!
Once registered, you will have access to a forum. Subject matter experts will respond to questions asked here.
The forum can also be used to discuss the competition with other participants, as can our Facebook community.
How to make a submission
Innovators may submit individually or as a team.
As an individual:
If you would like to submit as an individual, you will need to create a team of just yourself using the orange "Create Team" button to the right of the Explorer Challenge heading at the top of this page.
Once you've finished setting up your team, you can continue on to "Edit Submission" when ready.
As a group:
All team members must register and accept the terms and conditions.
Once you are ready to create a team, or plan to submit as an individual, create your team using the orange button at the top of this page.
You can invite teammates via email. Other teammates can be added up until submissions close, using the orange Edit Submission button (which replaces the Create Team button once a team has been started).
Submissions and Judging
You can develop your model using any approach you like, and can use our leaderboard function to get feedback as you go.
Using the Submit Score button, you can submit your model for benchmarking against our test dataset.
On the Submit Score page, you can submit as many times as you like for benchmark feedback.
Just Joining Us at Unearthed?
We have a brief "Getting Started Guide" here for new Unearthed members.