
Indexed in the SCIE (2017 Impact Factor 0.311) and in Scopus

Journal of Web Engineering

Editors-in-Chief:
Martin Gaedke, Chemnitz University of Technology, Germany
Geert-Jan Houben, Delft University of Technology, The Netherlands
Bebo White, Stanford University, USA


ISSN: 1540-9589 (Print Version),

ISSN: 1544-5976 (Online Version)
Vol: 17   Issue: Combined Issue 1 & 2

Published In:   January 2018

Publication Frequency: 8 issues per year



A Framework for Product Description Classification in E-commerce


Damir Vandic1, Flavius Frasincar1 and Uzay Kaymak2

1Econometric Institute, Erasmus University Rotterdam P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
2Department of Industrial Engineering & Innovation Sciences, Eindhoven University of Technology P.O. Box 513, 5600 MB Eindhoven, the Netherlands


Abstract: We propose the Hierarchical Product Classification (HPC) framework for the purpose of classifying products using a hierarchical product taxonomy. The framework uses a classification system with multiple classification nodes, each residing on a different level of the taxonomy. The innovative part of the framework stems from the definition of classification recipes that can be used to construct high-quality classifier nodes, using the product descriptions in the most optimal way. These classifier recipes are specifically tailored for the e-commerce domain. The use of these classifier recipes enables flexible classifiers that adjust to the taxonomy depth-specific characteristics of product taxonomies. Furthermore, in order to gain insight into which components are required to perform high-quality product classification, we evaluate several feature selection methods and classification techniques in the context of our framework. Based on 3000 product descriptions obtained from Amazon.com, HPC achieves an overall accuracy of 76.80% for product classification. Using 110 categories from CircuitCity.com and Amazon.com, we obtain a precision of 93.61% for mapping the categories to the taxonomy of shopping.com.

Keywords: Product descriptions, hierarchical clustering, feature selection, e-commerce
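
The idea of classifying a product level by level in a taxonomy, as described in the abstract above, can be illustrated with a minimal sketch. It is not the HPC framework itself: the toy taxonomy, the keyword weights, and the names `TAXONOMY`, `KEYWORDS`, `score` and `classify` are assumptions made for illustration, and simple keyword scoring stands in for the trained classifier nodes.

```python
from collections import Counter

# Hypothetical toy taxonomy: each node maps to its child categories.
TAXONOMY = {
    "root": ["Electronics", "Books"],
    "Electronics": ["Cameras", "Laptops"],
    "Books": [],
    "Cameras": [],
    "Laptops": [],
}

# Hypothetical keyword weights standing in for trained node classifiers.
KEYWORDS = {
    "Electronics": {"battery": 1, "usb": 1, "lens": 1, "screen": 1},
    "Books": {"paperback": 2, "author": 2, "edition": 1},
    "Cameras": {"lens": 2, "zoom": 2, "megapixel": 2},
    "Laptops": {"keyboard": 2, "ssd": 2, "screen": 1},
}


def score(category, tokens):
    """Sum the keyword weights of a category over the description's tokens."""
    weights = KEYWORDS.get(category, {})
    return sum(weights.get(tok, 0) * n for tok, n in tokens.items())


def classify(description, node="root"):
    """Descend the taxonomy, picking the best-scoring child at each level."""
    tokens = Counter(description.lower().split())
    path = []
    while TAXONOMY[node]:
        best = max(TAXONOMY[node], key=lambda child: score(child, tokens))
        if score(best, tokens) == 0:
            break  # no evidence for any child: stop at the current level
        path.append(best)
        node = best
    return path
```

Each level uses its own scoring table, which loosely mirrors the idea that classifiers can differ by taxonomy depth. A description with no matching keywords yields an empty path rather than a forced guess.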

Text-Mining and Pattern-Matching based Prediction Models for Detecting Vulnerable Files in Web Applications


Mukesh Kumar Gupta1, Mahesh Chandra Govil2 and Girdhari Singh3

1Department of Computer Science and Engineering Malaviya National Institute of Technology, Jaipur, Rajasthan, India
2Department of Computer Science and Engineering Malaviya National Institute of Technology, Jaipur, Rajasthan, India
3Department of Computer Science and Engineering Malaviya National Institute of Technology, Jaipur, Rajasthan, India


Abstract: The proliferation of technology has empowered web applications. At the same time, the presence of Cross-Site Scripting (XSS) vulnerabilities in web applications has become a major concern for all. Despite the many current detection and prevention approaches, attackers continue to exploit XSS vulnerabilities and cause significant harm to web users. In this paper, we formulate the detection of XSS vulnerabilities as a classification problem solved with prediction models. A novel approach based on text-mining and pattern-matching techniques is proposed to extract a set of features from source code files. The extracted features are used to build prediction models that can discriminate vulnerable code files from benign ones. The efficiency of the developed models is evaluated on a publicly available dataset that contains 9408 labeled (i.e., safe or unsafe) PHP source code files. The experimental results show the superiority of the proposed approach over existing ones.

Keywords: Cross-Site Scripting vulnerability, Web Security, Vulnerability Detection, Machine Learning.
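
As a rough illustration of pattern-matching feature extraction from source files, the sketch below counts a few patterns in a PHP source string. The pattern groups and the names `PATTERNS` and `extract_features` are hypothetical and far simpler than the feature set the abstract refers to. The resulting counts are the kind of feature vector that could be passed to a classifier.

```python
import re

# Hypothetical pattern groups; a real feature set would be much richer.
PATTERNS = {
    "user_input": re.compile(r"\$_(?:GET|POST|REQUEST|COOKIE)\b"),
    "output_sink": re.compile(r"\b(?:echo|print)\b"),
    "sanitizer": re.compile(r"\b(?:htmlspecialchars|htmlentities|strip_tags)\s*\("),
}


def extract_features(php_source):
    """Count how often each pattern group occurs in a PHP source string."""
    return {name: len(rx.findall(php_source)) for name, rx in PATTERNS.items()}


unsafe = '<?php echo "Hello " . $_GET["name"]; ?>'
safe = '<?php echo htmlspecialchars($_GET["name"]); ?>'
unsafe_features = extract_features(unsafe)
safe_features = extract_features(safe)
```

In this toy example the two files differ only in the `sanitizer` count, which is the signal a downstream model would have to pick up.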

A Quantitative Analysis of the Use of Microdata for Semantic Annotations on Educational Resources


Rosa Del Carmen Navarrete Rueda1 and Sergio Lujan2

1Escuela Politécnica Nacional, Ecuador
2University of Alicante, Spain


Abstract: A current trend in the semantic web is the use of embedded markup formats that aim to semantically enrich web content by making it more understandable to search engines and other applications. The deployment of Microdata as a markup format has increased thanks to the widespread adoption of the controlled vocabulary provided by Schema.org. Recently, a set of properties from the Learning Resource Metadata Initiative (LRMI) specification, which describes educational resources, was adopted by Schema.org. These properties, in addition to those related to accessibility and the license of resources included in Schema.org, would enable search engines to provide more relevant results when searching for educational resources for all users, including users with disabilities. In order to obtain a reliable evaluation of the use of Microdata properties related to the LRMI specification, accessibility, and the license of resources, this research conducted a quantitative analysis of the deployment of these properties in large-scale web corpora covering two consecutive years. The corpora contain hundreds of millions of web pages. The results further our understanding of this deployment and highlight the pending issues and challenges concerning the use of such properties.

Keywords: semantic web, Microdata, educational resources, Schema.org, LRMI, web standards
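
To make the notion of counting Microdata property deployment concrete, here is a minimal sketch that tallies `itemprop` names in a single HTML page using Python's standard `html.parser`. The class and function names and the sample page are assumptions for illustration; the study's own tooling for large-scale corpora is not described in the abstract.

```python
from collections import Counter
from html.parser import HTMLParser


class ItempropCounter(HTMLParser):
    """Collect how often each Microdata itemprop name appears in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "itemprop" and value:
                # One itemprop attribute may list several space-separated names.
                self.counts.update(value.split())


def count_itemprops(html):
    """Return a Counter of itemprop names found in the given HTML string."""
    parser = ItempropCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical sample page annotated with Schema.org properties.
SAMPLE = """
<div itemscope itemtype="https://schema.org/CreativeWork">
  <span itemprop="name">Intro to Algebra</span>
  <meta itemprop="educationalUse" content="assignment">
  <meta itemprop="typicalAgeRange" content="14-16">
  <a itemprop="license" href="https://creativecommons.org/licenses/by/4.0/">CC BY</a>
</div>
"""
sample_counts = count_itemprops(SAMPLE)
```

Summing such counters over many pages gives per-property deployment figures of the kind a quantitative analysis would report.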

Semantic Emotion-Topic Model Based Social Emotion Mining


Ruirong Xue1, Subin Huang2, Xiangfeng Luo3, Dandan Jiang4, Yike Guo5 and Yan Peng6

1School of Computer Engineering and Science, Shanghai University, China
2School of Computer Engineering and Science, Shanghai University, China and Anhui Polytechnic University, China
3Shanghai Institute for Advanced Communication and Data Science, School of Computer Engineering and Science, Shanghai University, China
4School of Computer Engineering and Science, Shanghai University, China
5School of Computer Engineering and Science, Shanghai University, China and Department of Computing, Imperial College London, UK
6School of Mechatronic Engineering and Automation, Shanghai University, China


Abstract: With the boom in social media users, more and more short texts with emotion labels appear in the social media environment. These texts contain users' rich emotions and opinions about social events or enterprise products. Social emotion mining on social media corpora can help governments and enterprises make decisions. Emotion mining models involve statistical-based and graph-based approaches. The former are more popular, e.g., the Latent Dirichlet Allocation (LDA)-based Emotion Topic Model. However, they suffer from poor retrieval performance, such as low accuracy and poor interpretability, because they consider only the bag-of-words or the emotion labels in the social media environment. In this paper, we propose an LDA-based Semantic Emotion-Topic Model (SETM) that combines emotion labels and inter-word relations to enhance retrieval performance in the social media environment. The influence of four factors on the performance of SETM is considered, i.e., association relations, computing time, topic number and semantic interpretability. Experimental results show that the accuracy of our proposed model is 0.750, compared with 0.606, 0.663 and 0.680 for the Emotion Topic Model (ETM), Multilabel Supervised Topic Model (MSTM) and Sentiment Latent Topic Model (SLTM), respectively. Besides, the computing time of our model is reduced by 87.81% by limiting word frequency, and its accuracy is 0.703, compared with 0.501, 0.648 and 0.642 for the above baseline methods. Thus, the proposed model has broad prospects in the social media environment.

Keywords: Social emotion mining; Semantic discovery; Social emotion classification; Topic Model; Semantic Emotion Topic Model

Unsupervised Keyword Extraction from Microblog Posts via Hashtags


Lin Li1,2, Jinghang Liu1, Yueqing Sun1, Guangdong Xu3, Jingling Yuan1 and Luo Zhong1

1School of Computer Science & Technology, Wuhan University of Technology, Wuhan, 430070, China
2Hubei Key Laboratory of Transportation Internet of Things, Wuhan University of Technology, Wuhan, 430070, China
3Advanced Analytics Institute, University of Technology, Sydney, NSW 2007, Australia


Abstract: Nowadays, huge amounts of text are being generated for social networking purposes on the Web. Keyword extraction from such texts, like microblog posts, benefits many applications such as advertising, search, and content filtering. Unlike traditional web pages, a microblog post usually has some special social feature like a hashtag that is topical in nature and generated by users. Extracting keywords related to hashtags can reflect the intents of users and thus provides us with a better understanding of post content. In this paper, we propose a novel unsupervised keyword extraction approach for microblog posts by treating hashtags as topical indicators. Our approach consists of two hashtag-enhanced algorithms. One is a topic model algorithm that infers topic distributions biased to hashtags on a collection of microblog posts. The words are ranked by their average topic probabilities. Our topic model algorithm can not only find the topics of a collection, but also extract hashtag-related keywords. The other is a random walk based algorithm. It first builds a word-post weighted graph by taking into account the posts themselves. Then, a hashtag-biased random walk is applied on this graph, which guides the algorithm to extract keywords according to hashtag topics. Last, the final ranking score of a word is determined by the stationary probability after a number of iterations. We evaluate our proposed approach on a collection of real Chinese microblog posts. Experiments show that our approach is more effective in terms of precision than traditional approaches that do not consider hashtags. The combination of the two algorithms performs even better than each individual algorithm.

Keywords: Keyword Extraction, Microblog Post, Hashtag, Topic Model, Random Walk
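
A hashtag-biased random walk of the general kind mentioned in the abstract above can be sketched as a personalized PageRank computed by power iteration. This is a simplified illustration, not the paper's algorithm: it uses a plain word graph rather than a word-post graph, and the function name, the damping value, and the toy data are assumptions.

```python
def biased_random_walk(graph, bias, damping=0.85, iterations=50):
    """Power iteration for a random walk that restarts according to `bias`.

    graph: dict mapping each word to {neighbor: edge_weight}; every neighbor
           must also be a key of `graph` (assumption of this sketch)
    bias:  dict mapping words to non-negative restart weights, with at least
           one positive weight on a word that is in the graph
    """
    nodes = list(graph)
    total_bias = sum(bias.get(n, 0.0) for n in nodes)
    restart = {n: bias.get(n, 0.0) / total_bias for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}

    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for src in nodes:
            out_weight = sum(graph[src].values())
            if out_weight == 0:
                # Dangling word: send its mass back to the restart distribution.
                for n in nodes:
                    new_rank[n] += damping * rank[src] * restart[n]
                continue
            for dst, weight in graph[src].items():
                new_rank[dst] += damping * rank[src] * weight / out_weight
        rank = new_rank
    return rank


# Hypothetical co-occurrence graph; "worldcup" plays the role of the hashtag.
GRAPH = {
    "worldcup": {"goal": 2, "match": 1},
    "goal": {"worldcup": 2, "match": 1},
    "match": {"worldcup": 1, "goal": 1},
    "weather": {"rain": 1},
    "rain": {"weather": 1},
}
scores = biased_random_walk(GRAPH, bias={"worldcup": 1.0})
keywords = sorted(scores, key=scores.get, reverse=True)
```

Because every restart lands on the hashtag word, words connected to it accumulate probability mass, while the unrelated `weather` and `rain` pair decays toward zero.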

A Graph Based Technique of Process Partitioning


Gang Xue, Jing Liu, Liwen Wu, and Shaowen Yao

School of Software, Yunnan University, Kunming, Yunnan, China


Abstract: Web services are an important technology for constructing distributed applications. To provide more complex functionality, services can be reused through service composition. A service composition can be designed and implemented with a centralized or a decentralized strategy. Several researchers have observed that decentralized service compositions have their own advantages, and these findings have promoted the development of approaches for designing, implementing and applying them. Process partitioning divides a process into a collection of smaller parts. It can be applied to a process in a centralized service composition, and the result supports the construction of a decentralized service composition. This paper presents a process partitioning technique. The technique can be used for constructing decentralized service compositions, and it provides a graph transformation based approach to reorganizing a process that is represented as a process structure graph. Compared to existing approaches, the technique can partition both well-structured and unstructured processes. The paper also discusses some issues concerning decentralized service compositions and performance tests of service compositions. Experimental results show that, compared with the centralized service composition, the decentralized service composition can have lower average response time and higher throughput in a runtime environment.

Keywords: process partitioning, graph transformation based algorithm, typed directed graphs
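
The basic idea of dividing a process into smaller parts can be sketched by grouping the activities of a process graph by the service that executes them and listing the control-flow edges that cross partition boundaries. This is a deliberately simple illustration and not the graph transformation technique of the paper. The function name, the data layout, and the example process are assumptions.

```python
from collections import defaultdict


def partition_process(edges, owner):
    """Split a process graph by owning service.

    edges: list of (source_activity, target_activity) control-flow edges
    owner: dict mapping each activity to the service that executes it
    Returns (partitions, local_edges, cross_edges).
    """
    partitions = defaultdict(list)
    for activity, service in owner.items():
        partitions[service].append(activity)

    local_edges = defaultdict(list)
    cross_edges = []
    for src, dst in edges:
        if owner[src] == owner[dst]:
            local_edges[owner[src]].append((src, dst))
        else:
            # Control flow leaves one partition: in a decentralized
            # composition this hand-over would become a message.
            cross_edges.append((src, dst))
    return dict(partitions), dict(local_edges), cross_edges


# Hypothetical order-handling process with three participating services.
EDGES = [("receive", "check"), ("check", "charge"),
         ("charge", "ship"), ("ship", "reply")]
OWNER = {"receive": "Order", "check": "Order", "charge": "Payment",
         "ship": "Shipping", "reply": "Order"}
parts, local, cross = partition_process(EDGES, OWNER)
```

The number of cross-partition edges gives a rough sense of how much inter-service messaging a decentralized composition of this process would need.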

River Publishers: Journal of Web Engineering