A faster way to solve complex planning problems

When some commuter trains reach the end of the line, they must travel to a switching platform to be turned around so they can depart the station later, often from a different platform than the one where they arrived.

Engineers use software programs called algorithmic solvers to plan these movements, but at a station with thousands of weekly arrivals and departures, the problem becomes too large for a traditional solver to tackle all at once.

Using machine learning, MIT researchers have developed an improved planning system that reduces solve time by up to 50 percent and produces a solution that better meets a user’s objective, such as on-time train departures. The new method could also be applied to efficiently solve other complex logistical problems, such as scheduling hospital staff, assigning airline crews, or allotting tasks to factory machines.

Engineers often simplify these kinds of problems by breaking them into a sequence of overlapping subproblems that can each be solved in a feasible amount of time. But the overlaps between subproblems cause many decisions to be needlessly recomputed, so it takes the solver much longer to reach an optimal solution.

The new, artificial intelligence-enhanced approach learns which parts of each subproblem should remain unchanged and freezes those variables to avoid redundant computation. A traditional algorithmic solver then handles the remaining variables.

“Frequently, a dedicated team could spend months or even years designing an algorithm to solve just one of these combinatorial problems. Modern deep learning gives us an opportunity to use new advances to help streamline the design of these algorithms. We can take what we know works well and use AI to enhance it,” says Cathy Wu, the Thomas D. and Virginia W. Cabot Career Development Associate Professor in Civil and Environmental Engineering (CEE) and the Institute for Data, Systems, and Society (IDSS) at MIT, and a member of the Laboratory for Information and Decision Systems (LIDS).

She is joined on the paper by lead author Sirui Li, an IDSS graduate student; Wenbin Ouyang, a CEE graduate student; and Yining Ma, a LIDS postdoc. The research will be presented at the International Conference on Learning Representations.

Removing redundancy

A key motivation for this research came from a practical problem identified by master’s student Devin Camille Wilkins in Wu’s introductory transportation course. The student wanted to apply reinforcement learning to a real train-dispatch problem at Boston’s North Station. The transit organization needs to assign many trains to a limited number of platforms where they can be turned around well in advance of their arrival at the station.

This proves to be a highly complex combinatorial scheduling issue — precisely the type of challenge that Wu’s lab has been addressing over the past few years.

When faced with a long-term problem that involves assigning a limited set of resources, such as factory jobs, to a group of machines, planners often frame it as a problem called Flexible Job Shop Scheduling.

In Flexible Job Shop Scheduling, each task needs a different amount of time to complete, but tasks can be assigned to any machine. At the same time, each task is composed of operations that must be performed in the correct order.
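To make that structure concrete, here is a minimal Python sketch of how a Flexible Job Shop instance could be represented. The machines, jobs, and durations are invented for illustration and are not taken from the paper.

```python
# Minimal, illustrative representation of a Flexible Job Shop instance.
# Each job is an ordered list of operations, and each operation records
# which machines can process it and how long it takes on each machine.
# All names and numbers below are made up for illustration.
from dataclasses import dataclass, field


@dataclass
class Operation:
    job: str
    index: int        # position within the job; operations must run in order
    durations: dict   # machine name -> processing time (hours)


@dataclass
class FJSPInstance:
    machines: list
    jobs: dict = field(default_factory=dict)   # job name -> list of Operations


instance = FJSPInstance(machines=["M1", "M2", "M3"])
instance.jobs["job_A"] = [
    Operation("job_A", 0, {"M1": 3, "M2": 5}),   # can run on M1 or M2
    Operation("job_A", 1, {"M2": 2, "M3": 4}),   # must wait for operation 0
]
instance.jobs["job_B"] = [
    Operation("job_B", 0, {"M1": 4, "M3": 3}),
]
```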

Such problems quickly become too large and unwieldy for traditional solvers, so users can apply rolling horizon optimization (RHO) to break the problem into manageable chunks that can be solved faster.

With RHO, a user assigns an initial handful of tasks to machines within a fixed planning horizon, perhaps a four-hour window of time. Then, they execute the first task in that sequence and shift the four-hour planning horizon forward to add the next task, repeating the process until the entire problem is solved and the final schedule of task-machine assignments is created.
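In code, that rolling procedure might look something like the schematic sketch below, where solve_window stands in for a call to a conventional solver and the window and step sizes are assumptions chosen to mirror the four-hour example.

```python
# Schematic sketch of rolling horizon optimization (RHO), not the authors' code.
# `solve_window` stands in for a conventional algorithmic solver; `operations`
# is a list of (release_time, operation) pairs, and the plan it returns is
# assumed to be a list of dicts with a "start" time.
WINDOW_HOURS = 4   # planning horizon, as in the four-hour example above
STEP_HOURS = 1     # how far the horizon advances each iteration (assumed)


def rolling_horizon(operations, solve_window, total_hours):
    committed = []   # decisions that have been executed and are never revisited
    t = 0
    while t < total_hours:
        # 1. Gather every operation whose release time falls in the current window.
        window_ops = [op for release, op in operations
                      if t <= release < t + WINDOW_HOURS]
        # 2. Solve the subproblem restricted to this window.
        plan = solve_window(window_ops, start=t)
        # 3. Commit only the earliest decisions, then slide the window forward.
        committed.extend(a for a in plan if a["start"] < t + STEP_HOURS)
        t += STEP_HOURS
    return committed
```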

The planning horizon should be longer than any one task’s duration, since the solution will be better if the algorithm also considers tasks that are coming up.

But as the planning horizon moves forward, it creates some overlap with operations in the previous planning horizon, operations for which the algorithm has already come up with preliminary solutions.

“Perhaps these preliminary solutions are good and don’t need to be recomputed, but perhaps they aren’t. This is where machine learning comes in,” Wu explains.

For their method, called learning-guided rolling horizon optimization (L-RHO), the researchers train a machine-learning model to predict which operations, or variables, should be recomputed when the planning horizon rolls forward.

L-RHO requires data to train the model, so the researchers solve a set of subproblems using a classical algorithmic solver. They take the best solutions, the ones with the most operations that don’t need to be recomputed, and use these as training data.
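One plausible way to turn those solved subproblems into labels, sketched below under assumptions the article does not spell out, is to mark an overlapping operation as safe to freeze whenever fully re-solving the new window leaves its assignment unchanged.

```python
# Illustrative labeling sketch for L-RHO training data (details assumed, not
# taken from the paper). Both arguments map an operation id to its
# (machine, start_time) assignment: `prev` from the previous window's
# preliminary solution, `resolved` from fully re-solving the new window.
def label_overlap(prev, resolved):
    labels = {}
    for op_id, prev_assignment in prev.items():
        if op_id in resolved:
            # 1 = the assignment survived re-solving, so recomputing it
            #     would have been wasted effort; 0 = it changed.
            labels[op_id] = int(prev_assignment == resolved[op_id])
    return labels
```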

Once the model is trained, it is given a new subproblem it hasn’t seen before and predicts which operations should not be recomputed. The remaining operations are fed back into the algorithmic solver, which executes the task, recomputes these operations, and moves the planning horizon forward. Then the loop starts all over again.
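Putting the pieces together, a single step of that loop might look roughly like the following sketch, where extract_features, classifier, and solver are assumed placeholders (for example, any scikit-learn-style model with a predict method) rather than the authors’ actual components.

```python
# Simplified sketch of one L-RHO step (function names are assumptions, not the
# authors' code). A trained classifier decides which overlapping operations to
# freeze; only the remaining operations are handed back to the solver.
def l_rho_step(window_ops, overlap_ids, prev_solution, classifier,
               solver, extract_features):
    # Score each overlapping operation and predict whether to freeze it.
    features = [extract_features(op_id, prev_solution) for op_id in overlap_ids]
    keep_flags = classifier.predict(features)          # 1 = do not recompute

    # Freeze the operations the model predicts are already fine.
    frozen = {op_id: prev_solution[op_id]
              for op_id, keep in zip(overlap_ids, keep_flags) if keep}

    # The solver only optimizes the operations that were not frozen; the frozen
    # assignments are passed along as fixed constraints.
    free_ops = [op for op in window_ops if op["id"] not in frozen]
    new_plan = solver(free_ops, fixed=frozen)
    return {**frozen, **new_plan}
```

The key point of the sketch is that the solver only ever sees the unfrozen operations, which is why dropping variables in this way can shrink the problem and speed up each solve.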

“In hindsight, if we didn’t need to reoptimize them, we can exclude those variables from the problem. Since these issues expand exponentially in complexity, it can be significantly advantageous if we can eliminate some of those variables,” she adds.

An adaptable, scalable method

To evaluate their approach, the researchers compared L-RHO to several base algorithmic solvers, specialized solvers, and approaches that rely only on machine learning. It outperformed them all, reducing solve time by 54 percent and improving solution quality by up to 21 percent.

In addition, their method continued to outperform all baselines when they tested it on more complex variants of the problem, such as when factory machines break down or when there is extra train congestion. It even outperformed additional baselines the researchers created to test their solver.

“Our approach can be applied, without modification, to all these different variants, which is really what we set out to do with this line of research,” she says.

L-RHO can also adapt if the objectives change, automatically generating a new algorithm to solve the problem; all it needs is a new training dataset.

In the future, the researchers want to better understand the logic behind their model’s decision to freeze some variables but not others. They also want to integrate their approach into other types of complex optimization problems, like inventory management or vehicle routing.

This research was supported in part by the National Science Foundation, MIT’s Research Support Committee, an Amazon Robotics PhD Fellowship, and MathWorks.

