Data Science Laboratory

Tokyo, Japan


Lee Y.,Data Science Laboratory | Ryu H.,Data Science Laboratory | Lee H.,Data Science Laboratory
Proceedings of the International Conference on Industrial Engineering and Operations Management | Year: 2017

Investment strategies and prediction techniques have emerged to analyze stock market patterns for economic gain. However, predicting the movement of a stock index is difficult because the stock market contains uncertain factors. A variety of methodologies are being developed to overcome this problem. In addition, with the rise of 'big data', a variety of unstructured data has become available through social media. In this research paper, we therefore predicted stock price fluctuations using news data. We used morpheme analysis and sentiment analysis to convert the news text into numerical form. We then applied machine learning to this data to build a prediction model. Finally, we obtained the prediction rate and F1 score. © IEOM Society International.
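
The two evaluation measures named in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes "prediction rate" means plain accuracy and that price movement is encoded as 1 (up) or 0 (down), neither of which the abstract states. The function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (assumed meaning of 'prediction rate')."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

F1 is usually reported alongside accuracy because up and down days are rarely balanced, and accuracy alone can look good for a model that always predicts the majority class.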


Min H.,Data Science Laboratory | Shin J.,Data Science Laboratory | Choi J.,Data Science Laboratory | Lee H.,Data Science Laboratory
Proceedings of the International Conference on Industrial Engineering and Operations Management | Year: 2017

Owing to rapid advances in computing environments, problems in Computational Fluid Dynamics (CFD) can be modeled and simulated readily for a better understanding of complex systems, which enables the analysis of many complex flows. In this research, we derive the mathematical and fluid-dynamical content of the Navier-Stokes equations, examine the methods used in CFD, and compare the CFD results with the results of our own code. Lastly, we derive the governing equations under the model's conditions. © IEOM Society International.
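
For reference, the Navier-Stokes equations are commonly written in the incompressible form below. The abstract does not say which form or which boundary conditions the authors used, so incompressibility is an assumption here. The symbols are the conventional ones: velocity $\mathbf{u}$, pressure $p$, constant density $\rho$, kinematic viscosity $\nu$, and body force per unit mass $\mathbf{f}$.

```latex
\begin{aligned}
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\,\mathbf{u}
  &= -\frac{1}{\rho}\nabla p + \nu \nabla^{2}\mathbf{u} + \mathbf{f}, \\
\nabla\cdot\mathbf{u} &= 0 .
\end{aligned}
```

The first line is the momentum balance and the second is mass conservation for a constant-density fluid.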


Lee J.,Data Science Laboratory | Ji M.,Data Science Laboratory | Lee H.,Data Science Laboratory
Proceedings of the International Conference on Industrial Engineering and Operations Management | Year: 2016

The goal of corporate financial management is to maximize profit, and investing on the basis of financial management minimizes investment risk. It is critical for companies to find a way to maximize their benefits. Given the economic situation, companies need to improve their future value by investing. This paper aims to determine the optimal investment plan using mathematical programming methods, such as linear programming. In this paper, profit maximization is defined as a concept of financial management. In addition, the paper evaluates direct investment plans in terms of expected return and risk. The concept of investment decisions is also discussed. Finally, the investment plan is defined mathematically and modeled directly as a linear program to derive the optimal investment plan. © IEOM Society International.
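
A generic linear program of the kind the abstract describes might look like the sketch below. It is not the paper's model. Every symbol is an assumption introduced for illustration: $x_i$ is the amount invested in option $i$, $r_i$ its expected return per unit invested, $\rho_i$ a linear risk score per unit invested, $B$ the budget, $R_{\max}$ the tolerated total risk, and $u_i$ a per-option cap.

```latex
\begin{aligned}
\max_{x_1,\dots,x_n} \quad & \sum_{i=1}^{n} r_i x_i \\
\text{s.t.} \quad & \sum_{i=1}^{n} x_i \le B, \\
& \sum_{i=1}^{n} \rho_i x_i \le R_{\max}, \\
& 0 \le x_i \le u_i, \qquad i = 1,\dots,n .
\end{aligned}
```

The objective and all constraints are linear in $x_i$, which is what allows a standard linear programming solver to be used. Risk measured as variance would be quadratic and would need a different formulation.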


Cho H.,Data Science Laboratory | Han Y.,Data Science Laboratory | Lee H.,Data Science Laboratory
Proceedings of the International Conference on Industrial Engineering and Operations Management | Year: 2016

India has one of the fastest growing stock markets. However, research on the Indian stock market appears to be severely lacking. This paper develops an investment portfolio based on Markowitz's Portfolio Selection Theory using historical Indian stock return data. The experiment period spans nine years, from the opening day of 2006 to the closing day of 2014. The research uses the SENSEX index of the BSE (Bombay Stock Exchange) as its benchmark and compares the rate of change of the SENSEX with the rate of portfolio return. The investments were chosen from the top 30 on the SENSEX as of June 6, 2015, excluding five that lacked data. The portfolio was composed with an eight-week investment period and an eight-week rebalancing cycle. The results showed that the rebalancing cycle influences the rate of return. A four-week rebalancing cycle outperformed both the eight- and twelve-week cycles and the rate of change of the SENSEX. In addition, this paper compares the return-to-risk ratio, also known as the Sharpe ratio, which measures portfolio performance. © IEOM Society International.
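
The Sharpe ratio mentioned at the end of the abstract can be computed from a series of periodic returns as sketched below. This is an illustration only: the function name, the per-period risk-free rate parameter, and the use of the sample standard deviation are assumptions, and the abstract does not say how the authors annualized or estimated the ratio.

```python
import math


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns.

    `returns` holds per-period portfolio returns (e.g. one per rebalancing cycle);
    `risk_free` is the risk-free rate for the same period length (assumed constant).
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two returns to estimate volatility")
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    std = math.sqrt(variance)
    if std == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return mean / std
```

Comparing this ratio across four-, eight-, and twelve-week cycles would require returns measured over the same period length, or a consistent annualization, before the numbers are comparable.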


Kim D.,Data Science Laboratory | Lee H.,Data Science Laboratory
Proceedings of the International Conference on Industrial Engineering and Operations Management | Year: 2016

To prevent crime, it is important to understand the spatial structure of society, because crime is a serious social problem. This research applies spatial autocorrelation analysis, namely SA and LISA, which take spatial factors into account, to occurrence data for the five major crimes in Seoul from 2011 to 2013. The results identify spatial dependence and reveal hot-spots, cold-spots, and spatial outliers. The research presents the year-by-year trends interpreted from the LISA results by number of groups, type of crime, and area. First, in terms of the number of groups, the number of hot-spots and cold-spots decreased, and spatial dependence decreased as well. Second, in terms of type of crime, meaningful characteristics were found for theft, murder, and robbery. Third, in terms of area, Songpa-gu was a hot-spot and Nowon-gu was a cold-spot in all three years. In addition, Seocho-gu showed hot-spot and LH patterns, indicating that its crime rate decreased. Dongjak-gu and Yangcheon-gu mostly showed LH, which means that the crime rate decreased. These results can be used to prevent crime by focusing on hot-spot areas and taking spatial dependence into account. © IEOM Society International.
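
The spatial autocorrelation ideas in the abstract can be sketched with a global Moran's I and the quadrant labels (HH, LL, HL, LH) that LISA results are usually reported in. This is a simplified illustration under stated assumptions, not the authors' procedure: the weights are a hypothetical binary adjacency matrix, the function names are invented here, and the permutation-based significance test that a full LISA analysis needs is omitted.

```python
def global_morans_i(values, weights):
    """Global Moran's I for `values` under an n-by-n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]            # deviations from the mean
    w_sum = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    spread = sum(d * d for d in z)
    return (n / w_sum) * (cross / spread)


def lisa_quadrants(values, weights):
    """Label each area by its own deviation and its neighbours' weighted deviation.

    'HH' = high value among high neighbours (hot-spot candidate),
    'LL' = low among low (cold-spot candidate),
    'HL' and 'LH' = candidate spatial outliers.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    labels = []
    for i in range(n):
        lag = sum(weights[i][j] * z[j] for j in range(n))  # spatial lag of area i
        own = "H" if z[i] > 0 else "L"
        neighbours = "H" if lag > 0 else "L"
        labels.append(own + neighbours)
    return labels
```

For four hypothetical districts in a row with crime counts `[10, 8, 2, 1]`, the statistic is positive (similar values cluster) and the labels split into two HH and two LL areas. A label only counts as a hot-spot or cold-spot once its local statistic passes a significance test.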


Liang S.,University College London | Ren Z.,Data Science Laboratory | Zhao Y.,Shandong University | Ma J.,Shandong University | And 2 more authors.
ACM Transactions on Information Systems | Year: 2017

User clustering has been studied from different angles. In order to identify shared interests, behavior-based methods consider similar browsing or search patterns of users, whereas content-based methods use information from the contents of the documents visited by the users. So far, content-based user clustering has mostly focused on static sets of relatively long documents. Given the dynamic nature of social media, there is a need to dynamically cluster users in the context of streams of short texts. User clustering in this setting is more challenging than in the case of long documents, as it is difficult to capture the users' dynamic topic distributions in sparse data settings. To address this problem, we propose a dynamic user clustering topic model (UCT). UCT adaptively tracks changes of each user's time-varying topic distributions based both on the short texts the user posts during a given time period and on previously estimated distributions. To infer changes, we propose a Gibbs sampling algorithm where a set of word pairs from each user is constructed for sampling. UCT can be used in two ways: (1) as a short-term dependency model that infers a user's current topic distribution based on the user's topic distributions during the previous time period only, and (2) as a long-term dependency model that infers a user's current topic distributions based on the user's topic distributions during multiple time periods in the past. The clustering results are explainable and human-understandable, in contrast to many other clustering algorithms. For evaluation purposes, we work with a dataset consisting of users and tweets from each user. Experimental results demonstrate the effectiveness of our proposed short-term and long-term dependency user clustering models compared to state-of-the-art baselines. © 2017 ACM.


Nagamatsu G.,Keio University | Nagamatsu G.,Japan Science and Technology Agency | Nagamatsu G.,Kyushu University | Saito S.,Data Science Laboratory | And 5 more authors.
Stem Cell Reports | Year: 2015

Primordial germ cells (PGCs) are lineage-restricted unipotent cells that can dedifferentiate into pluripotent embryonic germ cells (EGCs). Here we performed whole-transcriptome analysis during the conversion of PGCs into EGCs, a process by which cells acquire pluripotency. To examine the molecular mechanism underlying this conversion, we focused on Blimp-1 and Akt, which are involved in PGC specification and dedifferentiation, respectively. Blimp-1 overexpression in embryonic stem cells suppressed the expression of downstream targets of the pluripotency network. Conversely, Blimp-1 deletion in PGCs accelerated their dedifferentiation into pluripotent EGCs, illustrating that Blimp-1 is a pluripotency gatekeeper protein in PGCs. AKT signaling showed a synergistic effect with basic fibroblast growth factor plus 2i+A83 treatment on EGC formation. AKT played a major role in suppressing genes regulated by MBD3. From these results, we defined the distinct functions of Blimp-1 and Akt and provided mechanistic insights into the acquisition of pluripotency in PGCs. © 2015 The Authors.


PubMed | Japan Science and Technology Agency, National University of Singapore, Data Science Laboratory and Keio University
Type: Journal Article | Journal: Stem cell reports | Year: 2015

Primordial germ cells (PGCs) are lineage-restricted unipotent cells that can dedifferentiate into pluripotent embryonic germ cells (EGCs). Here we performed whole-transcriptome analysis during the conversion of PGCs into EGCs, a process by which cells acquire pluripotency. To examine the molecular mechanism underlying this conversion, we focused on Blimp-1 and Akt, which are involved in PGC specification and dedifferentiation, respectively. Blimp-1 overexpression in embryonic stem cells suppressed the expression of downstream targets of the pluripotency network. Conversely, Blimp-1 deletion in PGCs accelerated their dedifferentiation into pluripotent EGCs, illustrating that Blimp-1 is a pluripotency gatekeeper protein in PGCs. AKT signaling showed a synergistic effect with basic fibroblast growth factor plus 2i+A83 treatment on EGC formation. AKT played a major role in suppressing genes regulated by MBD3. From these results, we defined the distinct functions of Blimp-1 and Akt and provided mechanistic insights into the acquisition of pluripotency in PGCs.


News Article | November 9, 2015
Site: www.scientificcomputing.com

The 1999 Odisha Cyclone struck the eastern coast of India, knocking out whole swaths of the Indian Railways Network, bringing the eastern IRN system to a halt. Cyclones Hudhud and Phailin caused similar mayhem in 2014 and 2013, while in 2012 power blackouts in northern and eastern India idled 300 intercity passenger trains and commuter lines. Closer to home, severe winter storms that hit Boston in 2014 to 2015 brought the MBTA mass-transit system to its knees. Here and abroad, there is an urgent need for systematic strategies for recovering critical lifelines once disasters strike.

Thanks to Northeastern researchers, that need is being met. First-year graduate student Udit Bhatia, under the direction of Auroop R. Ganguly, associate professor in the Department of Civil and Environmental Engineering, has drawn on network science to develop a computerized tool for guiding stakeholders in the recovery of large-scale infrastructure systems. In addition to the IRN and MBTA, the method can be extended to water-distribution systems, power grids, communication networks, and even natural ecological systems.

This unique tool, which has been filed for invention protection through Northeastern University’s Center for Research Innovation, also informs development of preventative measures for limiting damage in the face of a disaster. The study, which Bhatia and Ganguly coauthored with Devashish Kumar, PhD’16, and Evan Kodra, PhD’14, appears in the November 4, 2015, issue of the journal PLOS ONE.

“The tool, based on a quantitative framework, identifies the order in which the stations need to be restored after full or partial destructions,” says Bhatia, PhD’18, who is a student in Northeastern’s Sustainability and Data Science Laboratory, directed by Ganguly.
“We found that, generally, the stations between two important stops were most critical,” he says, alluding to the network science concept of “centrality measures,” which identify stations that enable a large number of station-pairs to be connected to one another.

Bhatia credits Northeastern’s interdisciplinary engineering graduate program with opening his mind to the possibility of constructing the model. Through the program, he took courses with experts in a variety of fields. They include: “Critical Infrastructures Resilience,” co-taught by Ganguly, an expert in climate, hydrology, and applied data sciences, and Stephen Flynn, a professor of political science and director of the Center for Resilience Studies and co-director of the George J. Kostas Research Institute for Homeland Security, and “Complex Networks,” taught by Albert-László Barabási, Robert Gray Dodge Professor of Network Science. Insights from Jerome F. Hajjar, CDM Smith Professor and CEE Chair and an expert in structural engineering, also helped shape the model.

“Structural engineers have typically focused on rebuilding large infrastructures from the bottom up, identifying individual components or small-scale infrastructure systems,” says Bhatia. For IRN, this meant targeting the busiest station to begin repairs. Bhatia’s paper (based on a mix of real-world metrics, resilience, civil engineering principles, and network science-based algorithms) provides what Ganguly calls “a generic and quantitative top-down approach.”

A comprehensive strategy requires a blend of bottom-up and top-down approaches, says Ganguly. “If these nodes of the system go down, here is a timely, resource-efficient, and overall effective way to speed recovery.”

“Auroop and Udit are developing a system framework, which is a new approach for solving complex system problems,” says Jalal Mapar, Director of the Resilient Systems Division, Department of Homeland Security, Science & Technology Directorate.
“This new approach is very important and answers many of the complex questions that we will be facing in the next five to 50 years. It will help us understand the interdependencies and cascading effects of our critical infrastructure, and help us as a nation to be better prepared, because we know what we are dealing with.”

For the study, Bhatia mined open-source datasets on ticket-reservation Web sites to track the origins and destinations of trains running on the IRN, the world’s most traveled railway in terms of passenger kilometers per day. He then constructed a complex network, with the stations as nodes and the lines connecting those nodes as the “edges,” or links, between them, and overlaid it on a geographical map of the country. Next, he applied natural and man-made disasters to the system, knocking out stations using network science-derived algorithms.

“We considered real-life events that have brought down this network,” says Bhatia, ticking off the 2004 Indian Ocean Tsunami and the 2012 North Indian blackout due to a power grid failure, as well as a simulated cyber-physical attack, partially modeled after the November 2008 Mumbai terror attack. “We asked: Should this recovery be based on the number of trains each station handles, the number of connections each station has, the importance of the connections, where that station is located in the network, or something else?”

The researchers developed additional algorithms “to assign priority to each station,” Bhatia says, indicating when it should be brought back online to produce the fastest recovery of the entire system. In the IRN study, “betweenness centrality” often came to the fore. Bhatia cautions, however, that a single metric or strategy does not apply in all circumstances; for example, if just part of a network is disrupted, a particular station with an outsize number of connections might take precedence as a starting point over a station situated between two important stops.
“This model gives you the ability to say, ‘These are the most critical nodes in the network, which if they failed, would cause a domino effect in the case of a disruption, meaning a cascading failure when there’s a major shock,’” says Flynn, who recently testified before the U.S. House of Representatives on the prevention of and response to the arrival of a dirty bomb at a U.S. port. “‘So, that’s obviously where we should go first.’”

If the Boston MBTA had this tool during last winter’s historic snowfall, he says, they would have known where to start to get the transit system back up and running. Moreover, Flynn says, the model gives decision-makers (urban planners, emergency managers, operations personnel who run the system day-to-day) insight into how to design the most secure system upfront.

“And then,” he says, “it enables them to prioritize where to put mitigation measures: resources, such as backup power, and other safeguards, including computer-security measures, to make the overall system better withstand the risk of disruption.”
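
The network view described in the article (stations as nodes, lines as edges, stations ranked by betweenness centrality) can be sketched as follows. This is an illustration under stated assumptions and not the researchers' tool: it uses Brandes' algorithm on an unweighted, undirected graph, the station names are hypothetical, and ranking purely by betweenness is a simplification of the multi-metric prioritization the article describes.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected, unweighted graph.

    `graph` maps each station to a list of directly connected stations.
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                        # nodes in order of increasing distance
        preds = {v: [] for v in graph}    # predecessors on shortest paths
        paths = {v: 0 for v in graph}     # number of shortest paths from source
        dist = {v: -1 for v in graph}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                      # breadth-first search from source
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in graph}
        for w in reversed(order):         # accumulate from the farthest nodes back
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered station pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}


def restoration_order(graph):
    """Stations sorted by descending betweenness, ties broken by name."""
    score = betweenness_centrality(graph)
    return sorted(graph, key=lambda v: (-score[v], v))
```

On a hypothetical five-station line `A-B-C-D-E`, the middle station `C` lies on the most shortest paths and comes first in the ordering, which matches the article's remark that stations between important stops tend to be most critical. As Bhatia cautions, a degree-based or traffic-based ranking may be more appropriate when only part of the network is disrupted.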
