Scientific seminars given by experts from industrial sectors on topics such as biomedical imaging, cultural heritage, HPC, and medical imaging.
Visual Autoregressive (VAR) models are emerging as a powerful paradigm for generative computer vision, offering scalability and quality comparable to Diffusion models. However, deploying these large-scale architectures on industrial platforms is hindered by their prohibitive memory footprint and high computational cost. This seminar explores the theoretical and practical challenges of compressing autoregressive models for constrained hardware, such as consumer GPUs or even embedded platforms (e.g., NVIDIA Jetson). We will analyze the statistical properties of these networks, focusing on the activation outliers and channel variance that render standard weight-activation quantization techniques ineffective. We will then apply advanced algebraic approaches, such as Singular Value Decomposition (SVD), to structurally decouple high-rank outliers from weight matrices, enabling efficient low-bit representation without retraining. The talk aims to bridge the gap between mathematical optimization and real-world deployment, showing how generative AI can be enabled on the edge.
In this talk, we will showcase a selection of industrial projects developed by MOXOFF Srl (www.moxoff.com), highlighting how integrating images from different sources has been instrumental in transforming complex data into impactful, real-world applications. Following a brief introduction to the company by one of its co-founders, we will dive into several compelling use cases that illustrate the methodological approach, technological challenges, and tangible results.
The evolution of Machine Learning has radically transformed Computer Vision, shifting the focus from simple pattern recognition to deep semantic understanding and the generation of complex visual content. This seminar explores the methodologies and architectures that currently define the state of the art in the field, analyzing how the integration of statistical models and optimization techniques drives the performance of modern systems.
Drawing on the research experience of the AImageLab at Unimore, the talk will discuss the most recent paradigms: from Vision Transformers and multimodal analysis (Vision-and-Language) to the new frontiers of generative models (Diffusion Models) and neural 3D representations. It will offer an overview of current engineering challenges, showing how the design of increasingly efficient and versatile models is redefining the boundaries of image analysis in domains ranging from human-computer interaction to the automotive industry.
Deep learning-based methods have revolutionized the field of imaging inverse problems, yielding state-of-the-art performance across various imaging domains. The best-performing networks incorporate the imaging operator within the network architecture, typically in the form of deep unrolling. However, in large-scale problems such as 3D imaging, most existing methods fail to incorporate the operator in the architecture because global forward operators require a prohibitive amount of memory and hinder typical patching strategies. In this seminar, I will present a domain partitioning strategy and normal operator approximations that enable the training of end-to-end reconstruction models incorporating the forward operators of arbitrarily large problems into their architecture. The proposed method achieves state-of-the-art performance on 3D X-ray cone-beam tomography and 3D multi-coil accelerated MRI, while requiring only a single GPU for both training and inference.