This paper develops a Deep Learning model with a novel data smoothing technique to predict fine-grained on-street parking violation rates in Thessaloniki, Greece, from indirect features such as weather and time. It reports an MAE of 0.146 on the smoothed test set, compared with 0.175 for the baseline without smoothing and 0.251 for a naive average predictor.
Supervised Learning, Deep Learning, Prediction, Data Augmentation, Time Series Data, Urban Systems
Thien Nhan Vo
Ho Chi Minh City University of Technology (HUTECH), Vietnam
Generated by grok-3
Background Problem
The paper addresses the pervasive issue of illegal parking in large cities, which disrupts on-street parking systems by providing inaccurate information about parking slot availability. This affects drivers’ ability to find parking spaces, especially in crowded urban centers, and undermines trust in such systems. Traditional solutions like street occupation sensors are costly and impractical for large-scale deployment. The key problem solved is predicting fine-grained parking violation rates across different sectors of a city using indirect data (e.g., weather, time, historical patterns) to enhance the accuracy of parking availability information without expensive infrastructure.
Method
The proposed method employs a Deep Learning (DL) model to predict parking violation rates for specific sectors in an on-street parking system. The core idea is to use indirect features (spatial, temporal, and environmental data) to infer illegal parking rates. Key steps include: 1) Encoding spatial data by representing each sector as a vector of distances from 19 predefined Points of Interest (PoIs) to capture urban traffic patterns; 2) Encoding temporal data (weekday, day, month) using sine-based functions to account for periodicity, alongside time slots and weather data (temperature, humidity) averaged over windows; 3) Applying a novel data augmentation and smoothing technique that uses a Gaussian distribution to assign violation rates to nearby time slots, addressing sparse and noisy police scan data; 4) Utilizing a residual DL architecture with six hidden layers (512 to 32 neurons), ReLU activations, and a sigmoid output layer to predict violation rates bounded between 0.1 and 0.9, trained with the Adamax optimizer and learning rate decay. The method aims to provide actionable predictions for guiding drivers to sectors with likely available parking slots.
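Steps 2 and 3 above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the paper's implementation: the function names `cyclic_encode` and `gaussian_smooth` are hypothetical, the sine/cosine pair is one common reading of "sine-based" encoding, and the kernel width `sigma` and the weighted-average rule for combining nearby observations are assumed choices. The sample observations are invented for the example.

```python
import math

def cyclic_encode(value, period):
    """Map a periodic value (e.g. weekday 0-6, period 7) onto the unit circle.

    Assumption: a sine/cosine pair; the summary only says 'sine-based'.
    """
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def gaussian_smooth(observations, slots, sigma=1.0):
    """Spread sparse (slot, rate) observations over nearby time slots.

    Each slot receives a Gaussian-weighted average of the observed rates,
    so observations closer in time contribute more. Slots with negligible
    total weight are left as None (no nearby evidence).
    """
    smoothed = {}
    for t in slots:
        weights = [math.exp(-((t - s) ** 2) / (2.0 * sigma ** 2))
                   for s, _ in observations]
        total = sum(weights)
        if total < 1e-6:
            smoothed[t] = None
            continue
        smoothed[t] = sum(w * r for w, (_, r) in zip(weights, observations)) / total
    return smoothed

# Invented example: two police scans of one sector, at 9:00 and 15:00.
observations = [(9, 0.4), (15, 0.2)]
slots = range(7, 19)  # hourly slots within the 7:00 to 19:00 control window
smoothed = gaussian_smooth(observations, slots, sigma=1.0)
monday = cyclic_encode(0, 7)
```

With this rule, the slot at 9:00 stays close to its observed rate, while the slot at 12:00, equidistant from both scans, lands midway between them. A smaller `sigma` keeps the smoothed labels closer to the raw scans; a larger one blends them more aggressively.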
Experiment
The experiments were conducted using data from the THESi on-street parking system in Thessaloniki, Greece, covering 4700 parking slots across 396 sectors, with 3.8 million scans from 300,000 police checks and weather data from OpenWeather. The dataset was split into 80% training and 20% testing, with Mean Absolute Error (MAE) as the evaluation metric. The setup aimed to predict hourly violation rates per sector, justified by the need for fine-grained predictions during parking control hours (7:00 to 19:00). The baseline without smoothing reached an MAE of 0.175. With the proposed data smoothing technique, the MAE improved to 0.169 on the raw test set and 0.146 on the smoothed test set, against 0.251 for a naive average predictor. Cross-validation confirmed smoother convergence and better MAE with smoothed data. While the improvement is notable, the experimental design lacks comparisons with other methods (e.g., random forests from prior works) and does not address generalizability to other cities or robustness against overfitting. The results match the expectation of improved accuracy with smoothing, but the setup could be more comprehensive with additional benchmarks and sensitivity analyses.
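For readers unfamiliar with the metric, the comparison against a naive average predictor can be sketched as follows. This is an illustrative example only: the rates below are invented toy values, not data from the paper, and the helper name `mean_absolute_error` is our own. It assumes the naive predictor outputs the mean training rate for every test sample.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted rates."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy violation rates (invented for illustration).
train_rates = [0.2, 0.4, 0.6]
test_rates = [0.3, 0.5]

# Naive average predictor: always predict the mean of the training rates.
naive_prediction = sum(train_rates) / len(train_rates)
naive_mae = mean_absolute_error(test_rates, [naive_prediction] * len(test_rates))
```

A model is only useful if its MAE on the same test set falls clearly below this naive value, which is the role the 0.251 figure plays in the reported results.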
Further Thoughts
The approach of using indirect data for parking violation prediction opens up intriguing possibilities for broader urban management applications, such as predicting traffic congestion or pedestrian flow using similar low-cost, data-driven methods. However, the heavy reliance on police scan data, which is inherently sparse and potentially biased (e.g., certain sectors might be scanned more frequently due to enforcement priorities), could skew the model’s understanding of violation patterns. A deeper exploration into integrating real-time crowd-sourced data or mobile app inputs from drivers could enhance robustness. Additionally, the PoI-based spatial encoding, while innovative, might not capture the full complexity of urban layouts—could graph-based representations, as hinted in the paper’s future work, better model sector interdependencies? Relating this to other domains, the data smoothing technique bears resemblance to methods used in time-series forecasting for financial markets, where missing data and noise are also prevalent; exploring cross-disciplinary techniques like Kalman filtering could offer new insights. Lastly, the impact of cultural or behavioral differences in parking habits across cities remains unaddressed—future work could investigate transfer learning to adapt the model to diverse urban contexts.