Improving smart forestry through machine learning
When Rolf Schmitz and the CollectiveCrunch team started out, they knew they wanted to combine AI with their other passion: understanding climate and how it impacts our daily lives. They weren’t yet sure, however, how to bring it all together into a viable business.
“At the end of the day, we’re able to match supply and demand much better than any competitor thanks to Valohai boosting our model development speed by a factor of 2 to 5!”
Rolf Schmitz – CEO, CollectiveCrunch
At first, Rolf and the team were considering modeling and predicting air quality in larger cities in the same way that weather prediction is done today. By combining massive amounts of exclusive municipal data – like micro weather, traffic, cameras – CollectiveCrunch knew they could build accurate models to predict air quality. Unfortunately, they soon realized their plan had a flaw that kills many great ideas: it lacked a monetization model consumers or cities would be ready to pay for.
Luckily, it wasn’t long before the CollectiveCrunch team stumbled onto a severely underserved vertical in the forestry industry, where ML’s efficient predictions had the potential to make a massive impact - both on a company’s bottom line and on the environment itself through efficient natural resource management.
Predicting wood inventories for the forest supply chain
CollectiveCrunch’s main product, Linda Forest, helps the forestry industry better understand and target the raw materials they are buying. Essentially, it’s a method for minimizing risks around forest industry assets, such as sending too many or too few truckloads to the factories per day.
The firm has aimed its service at large players in the industry, such as forest funds who sell logs and large sawmills or pulp and paper producers who process logs. A modern pulp mill can have a supply chain of 300 or more trucks per day delivering logs. Getting the volume or quality of such deliveries wrong is a major pain point that Linda Forest addresses.
Their main customers are buyers and sellers of wood-based raw materials (logs) who want to know the quantity and quality of wood coming into the mills every day. Conventional prediction solutions for log and stem size, tree species and more struggle with a substantial margin of error. Traditional solutions also require somebody to drive out into the woods to do manual measurements. CollectiveCrunch solves the problem by giving the industry better prediction accuracy and reducing the time spent on manual inspection.
CollectiveCrunch makes use of a wide array of space data sets, optical and other, as well as LIDAR and process data to accurately predict wood quality and quantity. They’ve built a highly scalable cloud-based GIS solution, offered as a SaaS package, that provides real-time market intelligence at the click of a button.
“Our challenge is largely which model approach is going to work where, so it is a lot of data crunching, testing different models and picking the best one for the job. Valohai makes the process a lot easier for us thanks to parallel hyperparameter searches!”
Rolf Schmitz – CEO, CollectiveCrunch
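A parallel hyperparameter search of this kind boils down to evaluating many candidate configurations concurrently and keeping the best-scoring one. Here is a minimal sketch using only Python’s standard library; the `train_and_score` function and its toy scoring formula are hypothetical stand-ins for a real training run, not CollectiveCrunch’s actual models:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def train_and_score(params):
    """Hypothetical stand-in for one model-training run; returns a
    validation score for a single hyperparameter combination."""
    lr, depth = params["learning_rate"], params["max_depth"]
    # Toy scoring formula purely for illustration.
    return 1.0 - abs(lr - 0.01) - 0.01 * abs(depth - 6)

# Grid of candidate hyperparameter combinations to try in parallel.
grid = [
    {"learning_rate": lr, "max_depth": d}
    for lr, d in product([0.001, 0.01, 0.1], [4, 6, 8])
]

# Evaluate all combinations concurrently and keep the best-scoring one.
with ThreadPoolExecutor() as pool:
    scores = list(pool.map(train_and_score, grid))

best = grid[max(range(len(grid)), key=scores.__getitem__)]
```

On a platform like Valohai, each combination would typically run as its own isolated execution rather than a local thread, so every result stays versioned and comparable across the team.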
Building on Valohai – Boosting model development speed by a factor of five!
Since their inception, CollectiveCrunch has employed a combination of internal team members and a network of freelancers for its analytics work. Thanks to Valohai, they’ve been able to balance model training over several teammates while enjoying peace of mind that Valohai’s automatic version control ensures nothing is ever lost to turnover. Valohai’s complete historical data gives a clear map of what everyone has done, regardless of whether the experiment was run a week ago or a decade ago.
The team at CollectiveCrunch constantly tests a wide array of algorithms and tunes them for the best fit. Valohai allows the teammates working on a problem to collaborate, trying out different strategies and learning from each other’s experiments. The separation of steps into machine learning pipelines has also helped them split the work into distinct stages: image cleaning, data normalization, and model training.