
Publications (43)


Conformers have shown great results in speech processing due to their ability to capture both local and global interactions. In this work, we utilize a self-supervised contrastive learning framework to train conformer-based encoders that are capable of generating unique embeddings for small segments of audio, generalizing well to previously unseen data. We achieve state-of-the-art results for audio retrieval tasks while using only 3 seconds of audio to generate embeddings. Our models are almost completely immune to temporal misalignments and achieve state-of-the-art results in cases of other audio distortions such as noise, reverb or extreme temporal stretching. Code and models are made publicly available and the results are easy to reproduce as we train and test using popular and freely available datasets of different sizes.
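As an illustration of the contrastive objective mentioned above, the following is a minimal sketch of an InfoNCE-style loss over paired embeddings. The abstract does not state which loss the authors use, so the choice of InfoNCE, the `temperature` value, and the function names are assumptions made for this example only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss (assumed objective, not taken from the paper).

    anchors[i] and positives[i] are embeddings of two views of the same
    audio segment (for example, clean and distorted); every other
    positives[j] acts as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]
    return total / len(anchors)
```

The loss is small when each segment's embedding is closest to its own distorted view and large when it is closer to another segment, which is the property that makes the embeddings usable for retrieval.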

Sead Delalic, Rijad Mutapčić, Irhad Fejzić

The Vehicle Routing Problem (VRP) is among the most complex optimization problems. Practical solutions require addressing real-world constraints such as time windows, vehicle capacities, delivery restrictions, driver working hours, and heterogeneous vehicle fleets. Solutions are often implemented in two stages: the first involves clustering customers, while the second focuses on incremental routing of these clusters to reduce complexity and improve solution control and explainability. However, the second stage heavily depends on the quality of the first, and clustering methods vary depending on client requirements. This paper explores various clustering methods and their impact on the final routing results, with a focus on real-world examples. The study includes diverse client scenarios, ranging from small-scale distribution systems with a limited number of customers to large-scale operations managing more than a thousand deliveries daily, covering both small and large orders. From fixed clustering and geographic partitioning to dynamic clustering algorithms and hybrid approaches, the advantages and limitations of each method are analyzed. The findings aim to provide actionable insights into selecting clustering methods that align with specific use cases, ensuring enhanced efficiency and adaptability in practical applications.
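The two-stage idea described above can be sketched as follows. This is only an illustrative example: the angular sweep used for clustering and the nearest-neighbour heuristic used for routing are assumptions chosen for brevity, not the methods evaluated in the paper, and real constraints such as time windows and capacities are ignored.

```python
import math

def sweep_clusters(depot, customers, k):
    """Stage 1: split customers into at most k groups by polar angle around the depot."""
    if not customers or k < 1:
        return []
    ordered = sorted(
        customers,
        key=lambda c: math.atan2(c[1] - depot[1], c[0] - depot[0]),
    )
    size = math.ceil(len(ordered) / k)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def nearest_neighbour_route(depot, cluster):
    """Stage 2: build a depot-to-depot route by always visiting the closest unvisited customer."""
    route, current, remaining = [depot], depot, list(cluster)
    while remaining:
        nxt = min(remaining, key=lambda c: math.dist(current, c))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    route.append(depot)
    return route

def route_length(route):
    """Total Euclidean length of a route given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))

depot = (0, 0)
customers = [(1, 0), (2, 0), (-1, 0), (-2, 0)]
routes = [nearest_neighbour_route(depot, c) for c in sweep_clusters(depot, customers, 2)]
```

Because each cluster is routed independently, a poor split in the first stage cannot be repaired in the second, which is the dependency the abstract points out.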

Sead Delalic, Zinedin Kadric, Jana Jerkić, Faris Mehmedović

This paper addresses the challenge of analyzing CVs to parse their content into structured formats suitable for further processing and analysis. The proposed solution processes CVs provided as images or PDFs, handling diverse input formats, including free-form, multi-language, non-standardized layouts, and highly structured documents. Various heuristic approaches are employed for layout analysis, complemented by lightweight language models for extracting information. While multimodal models demonstrate strong performance, their cost and deployment complexity remain significant barriers. This study explores alternative methods optimized for computational efficiency, processing accuracy, and easier deployment. A comparative analysis of approaches is conducted on a standard dataset containing CVs from diverse clients and job roles, ranging from entry-level to specialized positions in various domains. The findings highlight the potential of these tailored, efficient solutions for scalable and secure CV parsing.

Sead Delalic, Samra Behić, Harun Goralija, Zenan Sabanac

Warehouse Management Systems (WMS) employ advanced optimization techniques to enhance efficiency and streamline processes, from inventory positioning to order picking and packing. Among these, order picking represents the most time-consuming and resource-intensive operation. This paper presents a novel approach for monitoring worker efficiency in warehouses, focusing on estimating the complexity and time required for order picking. A variety of factors influence these estimates, including item location, quantity, dimensions and weight of items, picking sequence, and whether the location is in the stock or picking zone. Accurate estimation enables effective daily work planning, real-time monitoring of worker productivity, and improved overall warehouse efficiency. The proposed approach has been tested in real-world warehouse environments, demonstrating its practical applicability and potential to significantly improve worker performance, resource allocation, and operational management.

Zlatan Ajanović, Hamza Merzić, Suad Krilasević, Eldar Kurtic, Bakir Kudić, Rialda Spahić, E. Alickovic, Aida Brankovic et al.

In this paper, we analyze examples of research institutes that stand out in scientific excellence and social impact. We define key practices for evaluating research results, economic conditions, and the selection of specific research topics. Special focus is placed on small countries and the field of artificial intelligence. The aim is to identify components that enable institutes to achieve a high level of innovation, self-sustainability, and social benefits.

Sead Delalic, Zinedin Kadric, Elmedin Selmanovic, Emin Mulaimović, E. Kadušić

Deep learning techniques in computer vision (CV) tasks such as object detection, classification, and tracking can be facilitated using predefined markers on those objects. Marker selection can affect the performance of tracking algorithms, as an algorithm might swap similar markers more frequently and therefore require more training data and training time. Still, the issue of marker selection has not been explored in the literature and seems to be glossed over throughout the process of designing CV solutions. This research considered the effects of symbol selection for 2D-printed markers on the neural network’s performance. The study assessed over 250 ALT code symbols readily available on most consumer PCs and provided a go-to selection for effectively tracking n objects. To this end, a neural network was trained to classify all the symbols and their augmentations, after which the confusion matrix was analysed to extract the symbols that the network distinguished best. The results showed that selecting symbols in this way performed better than random selection and the selection of common symbols. Furthermore, the methodology presented in this paper can easily be applied to a different set of symbols and different neural network architectures.
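One way to turn a confusion matrix into a symbol selection is a greedy search for symbols that are rarely confused with one another. The sketch below is an assumption about how such a step could look, since the abstract does not specify the exact procedure; the function name and the tie-breaking rule are hypothetical.

```python
def select_symbols(confusion, labels, n):
    """Greedily pick up to n labels with low mutual confusion.

    confusion[i][j] counts samples of true class i predicted as class j.
    Illustrative heuristic only, not the procedure from the paper.
    """
    if n <= 0 or not labels:
        return []

    def recall(i):
        row_total = sum(confusion[i])
        return confusion[i][i] / row_total if row_total else 0.0

    def mutual(i, j):
        # Confusion in both directions between classes i and j.
        return confusion[i][j] + confusion[j][i]

    # Start from the class the network recognises most reliably.
    chosen = [max(range(len(labels)), key=recall)]
    while len(chosen) < min(n, len(labels)):
        candidates = [i for i in range(len(labels)) if i not in chosen]
        # Prefer the least confusion with already chosen symbols, then higher recall.
        best = min(
            candidates,
            key=lambda i: (sum(mutual(i, j) for j in chosen), -recall(i)),
        )
        chosen.append(best)
    return [labels[i] for i in chosen]

matrix = [
    [9, 1, 0],   # true "A": sometimes predicted as "B"
    [4, 6, 0],   # true "B": often predicted as "A"
    [0, 0, 10],  # true "C": never confused
]
picked = select_symbols(matrix, ["A", "B", "C"], 2)
```

Here `picked` is `["C", "A"]`: "C" is never confused, and "A" is recognised more reliably than "B" while being equally unconfused with "C".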

Audio fingerprinting techniques have seen great advances in recent years, enabling accurate and fast audio retrieval even in conditions when the queried audio sample has been highly deteriorated or recorded in noisy conditions. Expectedly, most of the existing work is centered around music, with popular music identification services such as Apple’s Shazam or Google’s Now Playing designed for individual audio recognition on mobile devices. However, the spectral content of speech differs from that of music, necessitating modifications to current audio fingerprinting approaches. This paper offers fresh insights into adapting existing techniques to address the specialized challenge of speech retrieval in telecommunications and cloud communications platforms. The focus is on achieving rapid and accurate audio retrieval in batch processing instead of facilitating single requests, typically on a centralized server. Moreover, the paper demonstrates how this approach can be utilized to support audio clustering based on speech transcripts without undergoing actual speech-to-text conversion. This optimization enables significantly faster processing without the need for GPU computing, a requirement for real-time operation that is typically associated with state-of-the-art speech-to-text tools.

Elmedin Selmanovic, Emin Mulaimović, Sead Delalic, Zinedin Kadric, Zenan Sabanac

Many deep-learning computer vision systems analyse objects not previously observed by the system. However, such tasks can be simplified if the objects are marked beforehand. A straightforward method for marking is printing 2D symbols and attaching them to the objects. Selecting these symbols can affect the performance of the CV system, as similar symbols may require extended training time and a larger training dataset. It is possible to find good symbols that a given neural network differentiates easily. Still, there have been no efforts to generalise such findings in the literature, and it is not known whether the symbols optimal for one network would work just as well in another. We explored how transferable symbol selection is between networks. To this end, 30 sets of randomly selected and augmented symbols were classified by five neural networks. Each network was given the same training dataset and the same training time. Results were ranked and compared, which allowed the identification of networks that performed similarly, so that the symbol selection could be generalised between them.
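A simple way to compare how two networks rank the same symbol sets is Spearman's rank correlation. The abstract does not name the statistic used, so the sketch below is an assumption that illustrates the comparison step; the scores in the usage example are made up.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest) upward, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical accuracies of two networks on the same three symbol sets.
net_a = [0.90, 0.80, 0.70]
net_b = [0.95, 0.85, 0.60]
agreement = spearman(net_a, net_b)
```

A value near 1 means the two networks order the symbol sets the same way, suggesting that a selection found with one network may transfer to the other.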

In the field of telecommunications and cloud communications, detecting accurately and in real time whether a human or an answering machine has answered an outbound call is of paramount importance. This problem is of particular significance during campaigns, as precise caller identification improves service quality and efficiency and reduces costs. Despite the significance of the field, it remains inadequately explored in the existing literature. This paper presents an innovative approach to answering machine detection that leverages transfer learning through the YAMNet model for feature extraction. The YAMNet architecture facilitates the training of a recurrent-based classifier, enabling real-time processing of audio streams, as opposed to fixed-length recordings. The results demonstrate an accuracy of over 96% on the test set. Furthermore, we conduct an in-depth analysis of misclassified samples and reveal that an accuracy exceeding 98% can be achieved with the integration of a silence detection algorithm, such as the one provided by FFmpeg.

Zlatan Ajanović, E. Alickovic, Aida Brankovic, Sead Delalic, Eldar Kurtic, S. Malikić, Adnan Mehonic, Hamza Merzic et al.

Artificial Intelligence (AI) is one of the most promising technologies of the 21st century, with an already noticeable impact on society and the economy. With this work, we provide a short overview of global trends, applications in industry and selected use-cases from our international experience and work in industry and academia. The goal is to present global and regional positive practices and provide an informed opinion on the realistic goals and opportunities for positioning B&H on the global AI scene.

The vehicle routing problem is one of the most complex problems in the field of combinatorial optimization. Creating optimal routes leads to timely delivery of orders to end customers, which increases the efficiency of the company and enables maximum earnings. The problem of vehicle routing with a series of real-world constraints is called the rich vehicle routing problem (RVRP). The paper presents an approach to solving RVRP, where the asymmetric routing problem with a heterogeneous vehicle fleet, time windows, customer-vehicle constraints and a number of others is observed. The approach solves the problem in two phases, by dividing customers into clusters using a discrete metaheuristic Bat algorithm, and by solving the routing problem for each obtained cluster. The proposed approach has been tested for 26 days of delivery from large warehouses in Bosnia and Herzegovina. Significant savings were achieved compared to previously implemented approaches. All created routes were feasible. The approach automatically creates routes, and gives results in a shorter time than previously used approaches. Time does not increase significantly with the increase in the number of customers, which is a great advantage of the proposed approach.

Successful forecasting optimizes the processes of various companies and yields savings. In this process, the analysis of time series data is of particular importance. Since the creation of Facebook’s Prophet and Amazon’s DeepAR+ and CNN-QR forecasting models, forecasting algorithms have attracted a great deal of attention. The paper presents the application and comparison of the above algorithms for sales forecasting in distribution companies. A detailed comparison of the performance of the algorithms on real data with different lengths of sales history was made. The results show that Prophet gives better results for items with a longer history and frequent sales, while Amazon’s algorithms show superiority for items without a long history and items that are rarely sold.
