Machine Learning Research Group

Sydney, Australia


Quiterio T.M.,Federal University of São Paulo | Lorena A.C.,Machine Learning Research Group
Proceedings - 2016 5th Brazilian Conference on Intelligent Systems, BRACIS 2016 | Year: 2016

A common strategy for solving multiclass classification problems in Machine Learning is to decompose them into multiple binary sub-problems. The final multiclass prediction is obtained by properly combining the outputs of the binary classifiers induced for these sub-problems. Decision directed acyclic graphs (DDAGs) can be used to organize and aggregate the outputs of the pairwise classifiers from the one-versus-one (OVO) decomposition. Nonetheless, there are many possible DDAG structures for problems with many classes. In this paper, evolutionary algorithms are employed to heuristically find the positions of the OVO binary classifiers in a DDAG. The objective is to place easier sub-problems at higher levels of the DDAG hierarchical structure, in order to minimize the occurrence of cumulative errors. For estimating the complexity of the binary sub-problems, we employ two indices that measure the separability of the classes. The proposed approach yielded sound results in a set of experiments on benchmark datasets, although random DDAGs also performed quite well. © 2016 IEEE.
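
As a rough illustration of the DDAG mechanics described above (a minimal sketch, not the authors' implementation; the evolutionary search over class orderings and the separability indices are omitted), the following Python snippet trains one-versus-one classifiers and routes a sample through a DDAG for a given class ordering. The ordering is exactly what the evolutionary search would optimize.

```python
# Minimal sketch: DDAG evaluation over one-versus-one classifiers.
from itertools import combinations

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# Train one binary classifier per pair of classes (OVO decomposition).
pairwise = {}
for a, b in combinations(classes, 2):
    mask = np.isin(y, [a, b])
    pairwise[(a, b)] = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])

def ddag_predict(x, ordering):
    """Route x through the DDAG: at each node, test the two 'extreme' classes
    of the current candidate list and eliminate the loser until one remains."""
    candidates = list(ordering)
    while len(candidates) > 1:
        a, b = candidates[0], candidates[-1]
        key = (a, b) if (a, b) in pairwise else (b, a)
        pred = pairwise[key].predict(x.reshape(1, -1))[0]
        candidates.remove(b if pred == a else a)
    return candidates[0]

# A hypothetical ordering; the paper searches over such orderings heuristically.
ordering = list(classes)
acc = np.mean([ddag_predict(x, ordering) == t for x, t in zip(X, y)])
print(f"DDAG training accuracy with ordering {ordering}: {acc:.3f}")
```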


Zhang X.,Machine Learning Research Group | Lee W.S.,National University of Singapore | Teh Y.W.,University of Oxford
Advances in Neural Information Processing Systems | Year: 2013

Incorporating invariance information is important for many learning problems. To exploit invariances, most existing methods resort to approximations that either lead to expensive optimization problems such as semi-definite programming, or rely on separation oracles to retain tractability. Some methods further limit the space of functions and settle for non-convex models. In this paper, we propose a framework for learning in reproducing kernel Hilbert spaces (RKHS) using local invariances that explicitly characterize the behavior of the target function around data instances. These invariances are compactly encoded as linear functionals whose values are penalized by some loss function. Based on a representer theorem that we establish, our formulation can be efficiently optimized via a convex program. For the representer theorem to hold, the linear functionals are required to be bounded in the RKHS, and we show that this is true for a variety of commonly used RKHSs and invariances. Experiments on learning with unlabeled data and transform invariances show that the proposed method yields better or similar results compared with the state of the art.
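
The toy sketch below gestures at the idea under strong simplifying assumptions (a Gaussian kernel, squared losses, and a single translation invariance encoded as a finite difference); it is not the paper's formulation, but it shows how a local invariance becomes a linear functional of the coefficient vector within the representer span.

```python
# Minimal sketch: kernel ridge regression with a penalized local invariance
# f(x_i + delta) ~ f(x_i), expressed as a linear functional of alpha.
import numpy as np

def rbf(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)

delta, lam, mu = 0.2, 1e-2, 1.0
K = rbf(X, X)                 # kernel matrix at the data points
K_shift = rbf(X + delta, X)   # kernel evaluated at shifted points
D = K_shift - K               # linear functional: f(x_i + delta) - f(x_i)

# Solve min ||K a - y||^2 + lam * a' K a + mu * ||D a||^2 in closed form.
A = K.T @ K + lam * K + mu * D.T @ D
alpha = np.linalg.solve(A + 1e-8 * np.eye(len(X)), K.T @ y)

def predict(Xq):
    return rbf(Xq, X) @ alpha

print("train MSE:", np.mean((predict(X) - y) ** 2))
```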


Wu W.,University of Technology, Sydney | Li B.,University of Technology, Sydney | Li B.,Machine Learning Research Group | Chen L.,University of Technology, Sydney | Zhang C.,University of Technology, Sydney
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2016

Traditional cross-view information retrieval mainly rests on correlating two sets of features in different views. However, features in different views usually have different physical interpretations, so it may be inappropriate to map multiple views of data onto a shared feature space and compare them directly. In this paper, we propose a simple yet effective Cross-View Feature Hashing (CVFH) algorithm via a “partition and match” approach. The feature space for each view is bi-partitioned multiple times using B hash functions, so the resulting binary codes for all the views can be represented in a compatible B-bit Hamming space. To ensure that the hashed feature space is effective for supporting generic machine learning and information retrieval functionalities, the hash functions are learned to satisfy two criteria: (1) neighbors in the original feature spaces should also be close in the Hamming space; and (2) the binary codes for multiple views of the same sample should be similar in the shared Hamming space. We apply CVFH to cross-view image retrieval. The experimental results show that CVFH can outperform the Canonical Correlation Analysis (CCA) based cross-view method. © Springer International Publishing Switzerland 2016.
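
A minimal sketch of the “partition and match” mechanics follows. It uses random hyperplanes per view, whereas CVFH learns the hash functions to satisfy the two criteria above; all dimensions and feature constructions here are illustrative assumptions.

```python
# Minimal sketch: per-view B-bit hashing and cross-view retrieval by Hamming distance.
import numpy as np

rng = np.random.default_rng(0)
B = 16                              # number of hash bits
n, d_img, d_txt = 200, 64, 32       # hypothetical feature dimensions per view

X_img = rng.standard_normal((n, d_img))                            # view 1 features
X_txt = X_img[:, :d_txt] + 0.1 * rng.standard_normal((n, d_txt))   # correlated view 2

W_img = rng.standard_normal((d_img, B))   # one set of hash functions per view
W_txt = rng.standard_normal((d_txt, B))

code_img = (X_img @ W_img > 0).astype(np.uint8)   # B-bit codes in the shared space
code_txt = (X_txt @ W_txt > 0).astype(np.uint8)

def hamming_retrieve(query_code, db_codes, k=5):
    dist = (query_code[None, :] != db_codes).sum(axis=1)
    return np.argsort(dist)[:k]

# Query with a text code, retrieve image codes from the shared B-bit Hamming space.
print(hamming_retrieve(code_txt[0], code_img))
```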


Chen S.,University of New South Wales | Chen S.,Machine Learning Research Group | Epps J.,University of New South Wales | Epps J.,Machine Learning Research Group
IEEE Pervasive Computing | Year: 2013

Using cameras mounted near the eyes, the proposed system extracts information about blink patterns to estimate cognitive and perceptual loads and detect task transitions. Preliminary results pave the way for always-on wearable computing interfaces that understand the user's current task type, load, and transition. © 2002-2012 IEEE.
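
A hypothetical sketch of the kind of blink features such a system might compute is shown below; the per-frame eye-closure signal, frame rate, and windowing are assumptions rather than details from the article.

```python
# Minimal sketch: blink-rate and blink-duration features from an eye-closure signal.
import numpy as np

fps = 30                                                  # hypothetical frame rate
rng = np.random.default_rng(0)
eye_closed = (rng.random(fps * 60) < 0.02).astype(int)    # fake 1-minute signal

def blink_features(signal, fps, window_s=10):
    feats = []
    win = fps * window_s
    for start in range(0, len(signal) - win + 1, win):
        seg = signal[start:start + win]
        onsets = np.where(np.diff(seg) == 1)[0]           # closed-eye onsets
        rate = len(onsets) / window_s                     # blinks per second
        duration = seg.sum() / max(len(onsets), 1) / fps  # mean closure length (s)
        feats.append((rate, duration))
    return np.array(feats)

# Such per-window features could feed a load classifier or a change detector
# for task transitions.
print(blink_features(eye_closed, fps))
```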


Khoa N.L.D.,Machine Learning Research Group | Zhang B.,Machine Learning Research Group | Wang Y.,Machine Learning Research Group | Chen F.,Machine Learning Research Group | Mustapha S.,Networks Research Group
Structural Health Monitoring | Year: 2014

Structural health monitoring has been increasingly used due to advances in sensing technology and data analysis, facilitating the shift from time-based to condition-based maintenance. This work is part of the efforts that have applied structural health monitoring to the Sydney Harbour Bridge - one of Australia's iconic structures. It combines dimensionality reduction and pattern recognition techniques to accurately and efficiently distinguish faulty components from well-functioning ones. Specifically, random projection is used for dimensionality reduction of the vibration feature data. Then, healthy and damaged patterns of bridge components are learned in the lower-dimensional projected space using supervised and unsupervised machine learning methods, namely, a support vector machine and a one-class support vector machine. Experimental results using data from a laboratory-based building structure and the Sydney Harbour Bridge demonstrated the feasibility of applying machine learning techniques to dimensionality reduction and damage detection in structural health monitoring. Random projection combined with a support vector machine significantly reduces the computational time while maintaining detection accuracy, and the proposed method also outperformed popular dimensionality reduction techniques. The method using random projection can be more than 200 times faster than the same method without dimensionality reduction, while still achieving similar detection accuracy. © The Author(s) 2014.
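
A minimal sketch of the general pipeline (random projection followed by an SVM and a one-class SVM), using synthetic stand-in data rather than the bridge or laboratory measurements, might look like this with scikit-learn:

```python
# Minimal sketch: random projection for dimensionality reduction, then SVM / OCSVM.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.svm import SVC, OneClassSVM
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 400, 2000                       # hypothetical high-dimensional vibration features
X_healthy = rng.standard_normal((n, d))
X_damaged = rng.standard_normal((n, d)) + 0.5
X = np.vstack([X_healthy, X_damaged])
y = np.array([0] * n + [1] * n)        # 0 = healthy, 1 = damaged

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

proj = GaussianRandomProjection(n_components=64, random_state=0)
Z_tr, Z_te = proj.fit_transform(X_tr), proj.transform(X_te)

# Supervised setting: labels for both healthy and damaged states are available.
svm = SVC(kernel="rbf").fit(Z_tr, y_tr)
print("SVM accuracy:", svm.score(Z_te, y_te))

# Unsupervised setting: train only on healthy data and flag outliers as damage.
ocsvm = OneClassSVM(nu=0.1).fit(proj.transform(X_healthy))
pred = ocsvm.predict(proj.transform(X_damaged))   # -1 = outlier (potential damage)
print("flagged as damaged:", np.mean(pred == -1))
```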


Gould S.,Australian National University | Gould S.,Machine Learning Research Group | He X.,Computer Vision Research Group | He X.,Australian National University
Communications of the ACM | Year: 2014

Pixels labeled with a scene's semantics and geometry let computers describe what they see. © 2014 ACM.


Zhang D.,Orange Group | Sun L.,Orange Group | Li B.,Machine Learning Research Group | Chen C.,Chongqing University | And 3 more authors.
IEEE Transactions on Intelligent Transportation Systems | Year: 2015

Taxi service strategies, as the crowd intelligence of massive numbers of taxi drivers, are hidden in their historical time-stamped GPS traces. Mining GPS traces to understand the service strategies of skilled taxi drivers can benefit the drivers themselves, passengers, and city planners in a number of ways. This paper aims to uncover efficient and inefficient taxi service strategies based on a large-scale historical GPS database of approximately 7600 taxis over one year in a city in China. First, we separate the GPS traces of individual taxi drivers and link them with the revenue generated. Second, we investigate the taxi service strategies from three perspectives, namely, passenger-searching strategies, passenger-delivery strategies, and service-region preference. Finally, we represent the taxi service strategies with a feature matrix and evaluate the correlation between service strategies and revenue, informing which strategies are efficient or inefficient. We predict the revenue of taxi drivers based on their strategies and achieve a prediction residual as low as 2.35 RMB/h (RMB is the currency unit in China; 1 RMB is approximately US$0.17). © 2000-2011 IEEE.
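
A minimal sketch of the final step, assuming a hypothetical four-feature strategy matrix and synthetic revenue figures rather than the paper's GPS-derived data:

```python
# Minimal sketch: regress hourly revenue on strategy features and inspect weights.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_drivers = 500
# Hypothetical strategy features: [cruising_ratio, hotspot_wait_ratio,
# avg_detour_factor, downtown_preference]
S = rng.random((n_drivers, 4))
# Fake revenue with some dependence on the strategies, for illustration only.
revenue = 30 + 10 * S[:, 1] - 8 * S[:, 2] + rng.normal(0, 2, n_drivers)

model = Ridge(alpha=1.0)
pred = cross_val_predict(model, S, revenue, cv=5)
print("mean absolute residual (RMB/h):", np.mean(np.abs(pred - revenue)))
print("strategy weights:", model.fit(S, revenue).coef_)
```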


Yao H.,University of Alberta | Szepesvari C.,University of Alberta | Pires B.A.,University of Alberta | Zhang X.,Machine Learning Research Group
IEEE SSCI 2014 - 2014 IEEE Symposium Series on Computational Intelligence - ADPRL 2014: 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Proceedings | Year: 2014

In this paper we introduce the concept of pseudo-MDPs to develop abstractions. Pseudo-MDPs relax the requirement that the transition kernel must be a probability kernel. We show that the new framework captures many existing abstractions. We also introduce factored linear action models, a special case, and discuss their relation to existing work. We use the general framework to develop a theory for bounding the suboptimality of policies derived from pseudo-MDPs; specializing the framework, we recover existing results. We give a least-squares approach and a constrained optimization approach for learning the factored linear model, as well as efficient computation methods. We demonstrate that the constrained optimization approach gives better performance than the least-squares approach with normalization. © 2014 IEEE.
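
The sketch below shows only a generic least-squares fit of a factored linear action model on synthetic transitions (the constrained optimization variant and the suboptimality bounds are not reproduced here):

```python
# Minimal sketch: for each action, fit F_a so that phi(s') ~ F_a phi(s).
import numpy as np

rng = np.random.default_rng(0)
d, n_actions, n_samples = 8, 3, 5000

# Fake transition data: (state features, action, next-state features).
true_F = [rng.standard_normal((d, d)) * 0.3 for _ in range(n_actions)]
Phi = rng.standard_normal((n_samples, d))
A = rng.integers(0, n_actions, n_samples)
Phi_next = np.stack([true_F[a] @ phi + 0.01 * rng.standard_normal(d)
                     for a, phi in zip(A, Phi)])

# Least-squares estimate of F_a for each action (ridge-regularized).
for a in range(n_actions):
    Pa, Na = Phi[A == a], Phi_next[A == a]
    F = np.linalg.solve(Pa.T @ Pa + 1e-6 * np.eye(d), Pa.T @ Na).T
    print(f"action {a}: model error {np.linalg.norm(F - true_F[a]):.4f}")
```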


Cheng H.,University of Alberta | Zhang X.,Machine Learning Research Group | Schuurmans D.,University of Alberta
Uncertainty in Artificial Intelligence - Proceedings of the 29th Conference, UAI 2013 | Year: 2013

Although many convex relaxations of clustering have been proposed in the past decade, current formulations remain restricted to spherical Gaussian or discriminative models and are susceptible to imbalanced clusters. To address these shortcomings, we propose a new class of convex relaxations that can be flexibly applied to more general forms of Bregman divergence clustering. By basing these new formulations on normalized equivalence relations, we retain additional control over relaxation quality, which allows improvement in clustering quality. We furthermore develop optimization methods that improve scalability by exploiting recent implicit matrix norm methods. In practice, we find that the new formulations are able to efficiently produce tighter clusterings that improve the accuracy of state-of-the-art methods.
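
As a small illustration of the kind of object such relaxations work with, the sketch below builds the normalized equivalence relation M = Y(Y^T Y)^{-1}Y^T of a hard clustering and checks two of its properties; this is an assumption-level illustration, not the paper's relaxation or solver.

```python
# Minimal sketch: the normalized equivalence relation of a hard cluster assignment.
import numpy as np

labels = np.array([0, 0, 1, 1, 1, 2])        # a hypothetical clustering of 6 points
k = labels.max() + 1
Y = np.eye(k)[labels]                        # 6 x 3 assignment matrix
M = Y @ np.linalg.inv(Y.T @ Y) @ Y.T         # normalized equivalence relation

print(np.round(M, 3))
print("idempotent:", np.allclose(M @ M, M))  # M is a projection matrix
print("trace equals #clusters:", np.isclose(np.trace(M), k))
```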


Zhang X.,Machine Learning Research Group | Yu Y.,University of Alberta | Schuurmans D.,University of Alberta
Advances in Neural Information Processing Systems | Year: 2013

Structured sparse estimation has become an important technique in many areas of data analysis. Unfortunately, these estimators normally create computational difficulties that entail sophisticated algorithms. Our first contribution is to uncover a rich class of structured sparse regularizers whose polar operator can be evaluated efficiently. With such an operator, a simple conditional gradient method can then be developed that, when combined with smoothing and local optimization, significantly reduces training time versus the state of the art. We also demonstrate a new reduction of polar to proximal maps that enables a more efficient latent fused lasso.
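
A generic conditional-gradient (Frank-Wolfe) loop with an L1-ball linear minimization oracle, shown below, illustrates why a cheap polar operator is all such a method needs per iteration; the structured regularizers, smoothing, and local optimization from the paper are not reproduced, and the L1 case is only a stand-in.

```python
# Minimal sketch: Frank-Wolfe over an L1 ball, where each step uses only the
# coordinate with the largest gradient magnitude (the polar information).
import numpy as np

rng = np.random.default_rng(0)
n, d, tau = 100, 50, 5.0
A = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [3.0, -2.0, 1.5]
b = A @ w_true + 0.01 * rng.standard_normal(n)

w = np.zeros(d)
for t in range(200):
    grad = A.T @ (A @ w - b)
    # Linear minimization oracle over the L1 ball (one call to the polar operator).
    i = np.argmax(np.abs(grad))
    s = np.zeros(d)
    s[i] = -tau * np.sign(grad[i])
    gamma = 2.0 / (t + 2)                # standard Frank-Wolfe step size
    w = (1 - gamma) * w + gamma * s

print("support found:", np.nonzero(np.round(w, 2))[0])
print("residual norm:", np.linalg.norm(A @ w - b))
```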
