International Journal of Multimedia Information Retrieval

Papers
(The median citation count of the International Journal of Multimedia Information Retrieval is 1. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-03-01 to 2024-03-01.)
Article | Citations
A survey on instance segmentation: state of the art | 196
A review on deep learning in medical image analysis | 152
Anomaly detection using edge computing in video surveillance system: review | 41
Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown | 36
Recent trends in image watermarking techniques for copyright protection: a survey | 36
Design ensemble deep learning model for pneumonia disease classification | 35
Music similarity measurement and recommendation system using convolutional neural networks | 30
Generative adversarial networks: a survey on applications and challenges | 29
Generative adversarial networks and its applications in the biomedical image segmentation: a comprehensive survey | 19
Recent advances in local feature detector and descriptor: a literature survey | 19
A study of classification and feature extraction techniques for brain tumor detection | 17
Optimized MobileNet + SSD: a real-time pedestrian detection on a low-end edge device | 17
Multi-sensor human activity recognition using CNN and GRU | 13
Text detection, recognition, and script identification in natural scene images: a Review | 13
Contrastive self-supervised learning: review, progress, challenges and future research directions | 10
An automatic approach of audio feature engineering for the extraction, analysis and selection of descriptors | 9
Multimodal news analytics using measures of cross-modal entity and context consistency | 8
Cluster-based quotas for fairness improvements in music recommendation systems | 8
AMS-CNN: Attentive multi-stream CNN for video-based crowd counting | 8
Content-based image retrieval using Group Normalized-Inception-Darknet-53 | 7
Multimodal Quasi-AutoRegression: forecasting the visual popularity of new fashion products | 7
Music emotion recognition based on segment-level two-stage learning | 6
A novel method for video shot boundary detection using CNN-LSTM approach | 6
Organ segmentation from computed tomography images using the 3D convolutional neural network: a systematic review | 6
Caption TLSTMs: combining transformer with LSTMs for image captioning | 6
Siamese coding network and pair similarity prediction for near-duplicate image detection | 6
Image annotation: the effects of content, lexicon and annotation method | 6
Reinforcement learning applied to machine vision: state of the art | 6
Different techniques for Alzheimer’s disease classification using brain images: a study | 5
PDS-Net: A novel point and depth-wise separable convolution for real-time object detection | 5
A literature review and perspectives in deepfakes: generation, detection, and applications | 5
InceptionDepth-wiseYOLOv2: improved implementation of YOLO framework for pedestrian detection | 5
Multi-class imbalanced image classification using conditioned GANs | 4
An improved customized CNN model for adaptive recognition of cerebral palsy people’s handwritten digits in assessment | 4
Alleviating the cold-start playlist continuation in music recommendation using latent semantic indexing | 3
Your heart rate betrays you: multimodal learning with spatio-temporal fusion networks for micro-expression recognition | 3
FCT: fusing CNN and transformer for scene classification | 3
Human pose estimation using deep learning: review, methodologies, progress and future research directions | 3
Cross-domain image retrieval: methods and applications | 3
Semantic-aware visual scene representation | 2
FDAM: full-dimension attention module for deep convolutional neural networks | 2
Generative adversarial networks for 2D-based CNN pose-invariant face recognition | 2
Enhancing the performance of 3D auto-correlation gradient features in depth action classification | 2
Deep learning for video-text retrieval: a review | 2
A comprehensive survey of multimodal fake news detection techniques: advances, challenges, and opportunities | 2
A fast and robust affine-invariant method for shape registration under partial occlusion | 2
Gender classification from face images using central difference convolutional networks | 2
End-to-end residual learning-based deep neural network model deployment for human activity recognition | 2
Optical music recognition for homophonic scores with neural networks and synthetic music generation | 2
A unified approach of detecting misleading images via tracing its instances on web and analyzing its past context for the verification of multimedia content | 2
Counterfactual attribute-based visual explanations for classification | 2
Special issue on deep learning in image and video retrieval | 2
A lightweight small object detection algorithm based on improved YOLOv5 for driving scenarios | 1
CLIP-based fusion-modal reconstructing hashing for large-scale unsupervised cross-modal retrieval | 1
Tri-RAT: optimizing the attention scores for image captioning | 1
A comprehensive survey on chest diseases analysis: technique, challenges and future research directions | 1
Maximizing mutual information inside intra- and inter-modality for audio-visual event retrieval | 1
MHA-WoML: Multi-head attention and Wasserstein-OT for few-shot learning | 1
Study of Alzheimer’s disease brain impairment and methods for its early diagnosis: a comprehensive survey | 1
Prototype local–global alignment network for image–text retrieval | 1
Medical image watermarking: a survey on applications, approach and performance requirement compliance | 1
Multimodal image and audio music transcription | 1
Special issue on cross-modal retrieval and analysis | 1
Few2Decide: towards a robust model via using few neuron connections to decide | 1
Neural style transfer generative adversarial network (NST-GAN) for facial expression recognition | 1
Early-stopped learning for action prediction in videos | 1
Content-based image retrieval using handcraft feature fusion in semantic pyramid | 1
A local representation-enhanced recurrent convolutional network for image captioning | 1
MemeTector: enforcing deep focus for meme detection | 1
Who is gambling? Finding cryptocurrency gamblers using multi-modal retrieval methods | 1
TCKGE: Transformers with contrastive learning for knowledge graph embedding | 1
LG-MLFormer: local and global MLP for image captioning | 1
RGBD deep multi-scale network for background subtraction | 1
Sentiment analysis using deep learning techniques: a comprehensive review | 1
Modal interaction-enhanced prompt learning by transformer decoder for vision-language models | 1
SPSD: Similarity-preserving self-distillation for video–text retrieval | 1
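
The selection rule described in the note at the top of this page (a publication window, a median-citation threshold, and a cap of 250 papers) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(title, citations, published)` record layout and the `select_papers` function name are hypothetical, the toy records are placeholders rather than real data, and the comparison uses "at or above the median" because the table includes papers with exactly 1 citation.

```python
from datetime import date
from statistics import median


def select_papers(papers, start, end, max_papers=250):
    """Return papers in the window with citations at or above the median.

    `papers` is a list of (title, citations, published) tuples; this
    layout is a hypothetical one chosen for the sketch.
    """
    # Keep only papers published inside the window (inclusive).
    in_window = [p for p in papers if start <= p[2] <= end]
    if not in_window:
        return []
    # The threshold is the median citation count of the windowed papers.
    threshold = median(c for _, c, _ in in_window)
    # Assumption: "at or above" the threshold, matching the listed table.
    kept = [p for p in in_window if p[1] >= threshold]
    # Most-cited first, capped at max_papers.
    kept.sort(key=lambda p: p[1], reverse=True)
    return kept[:max_papers]


# Toy placeholder records, not taken from the table.
toy = [
    ("paper A", 5, date(2021, 1, 1)),
    ("paper B", 1, date(2022, 6, 1)),
    ("paper C", 0, date(2023, 1, 1)),
    ("paper D", 9, date(2019, 1, 1)),  # outside the window
]
selected = select_papers(toy, date(2020, 3, 1), date(2024, 3, 1))
print([title for title, _, _ in selected])
```

With the toy records, "paper D" falls outside the window, the median of the remaining counts is 1, and "paper C" drops below it.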