SAS Institute is a developer of analytics software based in Cary, North Carolina. SAS develops and markets a suite of analytics software that helps users manage, access, analyze and report on data to aid in decision-making. The company is the world's largest privately held software business, and its software is used by most of the Fortune 500.
SAS has developed a model workplace environment and benefits program designed to retain employees, allow them to focus on their work, and reduce operating costs. It provides on-site, subsidized or free healthcare, gyms, daycare and life counseling services. Professor Jeffrey Pfeffer of the Stanford Graduate School of Business estimated that the company saves $60–$80 million annually in expenses related to employee turnover.
SAS Institute started in the late 1960s as a project at North Carolina State University to create "statistical analysis software", which was originally used primarily by agricultural departments at universities. In 1976 it became an independent, private business led by current CEO James Goodnight and three other project leaders from the university. SAS grew from $10 million in revenues in 1980 to $1.1 billion by 2000. A larger proportion of these revenues is spent on research and development than at most other software companies, at one point more than double the industry average.
Wikipedia.
Selukar R., SAS Institute
Journal of Statistical Software | Year: 2011
This article provides a brief introduction to the state space modeling capabilities in SAS, a well-known statistical software system. SAS provides state space modeling in a few different settings. SAS/ETS, the econometric and time series analysis module of the SAS system, contains many procedures that use state space models to analyze univariate and multivariate time series data. In addition, SAS/IML, an interactive matrix language in the SAS system, provides Kalman filtering and smoothing routines for stationary and nonstationary state space models. SAS/IML also provides support for linear algebra and nonlinear function optimization, which makes it a convenient environment for general-purpose state space modeling.
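To make the Kalman filtering idea concrete, here is a minimal sketch of the filter for the simplest state space model, the local level model. It is written in plain Python rather than SAS/IML, and it does not reproduce any SAS routine. The function name, parameter names, and the large initial variance used to approximate a diffuse prior are all assumptions made for illustration.

```python
def kalman_filter_local_level(ys, obs_var, state_var, init_mean=0.0, init_var=1e6):
    """Filter the local level model (illustrative sketch, not a SAS routine).

    Observation: y_t = mu_t + eps_t,      eps_t ~ N(0, obs_var)
    State:       mu_{t+1} = mu_t + eta_t, eta_t ~ N(0, state_var)

    Returns a list of (filtered mean, filtered variance) pairs.
    """
    mean, var = init_mean, init_var  # large init_var approximates a diffuse prior
    filtered = []
    for y in ys:
        # Update step: blend the prediction with the new observation.
        gain = var / (var + obs_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        filtered.append((mean, var))
        # Predict step: the level is a random walk, so only the variance grows.
        var = var + state_var
    return filtered


estimates = kalman_filter_local_level([5.0] * 20, obs_var=1.0, state_var=0.1)
```

With a constant input series the filtered mean settles on the observed value, and the filtered variance stays below the observation variance.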
SAS Institute | Date: 2015-10-28
A computing device presents a cluster visualization based on a neural network computation. First centroid locations are computed for first clusters. Second centroid locations are computed for second clusters. Each centroid location includes a plurality of coordinate values where each coordinate value relates to a single variable of a plurality of variables. Distances are computed pairwise between each centroid location. An optimum pairing is selected based on a minimum distance of the computed pairwise distances where each pair is associated with a different cluster of a set of composite clusters. Noised centroid location data is created. A multi-layer neural network is trained with the noised centroid location data. A projected centroid location is determined in a multidimensional space for each centroid location as values of hidden units of a middle layer of the multi-layer neural network. A graph is presented for display that indicates the determined, projected centroid locations.
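The centroid pairing step can be sketched as follows. This is only one possible reading of the abstract: it assumes Euclidean distance and assumes the "optimum pairing" is the one-to-one assignment with the smallest total distance, found here by brute force over permutations (practical only for a handful of clusters). The function name is hypothetical, and the noising and neural network projection steps are not shown.

```python
import itertools
import math


def pair_centroids(first, second):
    """Pair each first-set centroid with a distinct second-set centroid.

    Assumption: the best pairing minimizes the summed Euclidean distance.
    Requires len(first) <= len(second).
    """
    # Pairwise distances between every centroid in `first` and in `second`.
    dist = [[math.dist(a, b) for b in second] for a in first]
    # Try every one-to-one assignment and keep the cheapest one.
    best = min(
        itertools.permutations(range(len(second)), len(first)),
        key=lambda perm: sum(dist[i][j] for i, j in enumerate(perm)),
    )
    return list(enumerate(best))


pairs = pair_centroids([(0.0, 0.0), (10.0, 10.0)], [(9.0, 9.0), (1.0, 1.0)])
# pairs == [(0, 1), (1, 0)]
```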
SAS Institute | Date: 2015-10-30
A computing device to select decorrelated variables using a graph-based method is provided. A correlation value is computed between each pair of a plurality of variables to define a correlation matrix. A binary threshold value is compared to each correlation value to define a binary similarity matrix from the correlation matrix. An undirected graph comprising a subgraph that includes one or more connected nodes is defined based on the binary similarity matrix to store connectivity information for the plurality of variables. Each node of the subgraph is pairwise associated with a unique variable of the variables. (a) A least connected node is selected from the undirected graph based on the connectivity information. (b) The selected least connected node is removed from the undirected graph. (c) The connectivity information for the undirected graph is updated based on the removed node. (d) Steps (a)-(c) are repeated until a stop criterion is satisfied.
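Steps (a) to (d) can be sketched roughly as below. The abstract leaves several details open, so this sketch makes explicit assumptions: the correlation matrix is supplied precomputed, an edge exists when the absolute correlation meets the threshold, the selected nodes form the output, ties are broken by lowest index, and the stop criterion is a target number of variables. The function and parameter names are hypothetical.

```python
def select_least_connected(corr, threshold, n_select):
    """Repeatedly select and remove the least connected variable (sketch)."""
    n = len(corr)
    # Binary similarity matrix stored as adjacency sets (assumes |corr| >= threshold).
    neighbors = {
        i: {j for j in range(n) if j != i and abs(corr[i][j]) >= threshold}
        for i in range(n)
    }
    selected = []
    # (d) Repeat until the assumed stop criterion: n_select variables chosen.
    while neighbors and len(selected) < n_select:
        # (a) Select the node with the fewest remaining edges.
        node = min(neighbors, key=lambda i: (len(neighbors[i]), i))
        selected.append(node)
        # (b) Remove it, and (c) update the connectivity of its neighbors.
        for other in neighbors.pop(node):
            neighbors[other].discard(node)
    return selected


corr = [
    [1.0, 0.9, 0.9, 0.1],
    [0.9, 1.0, 0.9, 0.0],
    [0.9, 0.9, 1.0, 0.2],
    [0.1, 0.0, 0.2, 1.0],
]
chosen = select_least_connected(corr, threshold=0.8, n_select=2)
# chosen == [3, 0]
```

Variable 3 is picked first because it has no edges; variables 0 to 2 are then tied, and the lowest index wins.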
SAS Institute | Date: 2015-07-02
Techniques for providing interactive decision trees are included. For example, a system is provided that stores data related to a decision tree, wherein the data includes one or more data structures and one or more portions of code. The system receives input corresponding to an interaction request associated with a modification to the decision tree. The system determines whether the modification requires multiple-processing iterations of the distributed data set. The system generates an application layer modified decision tree when the generating requires no multiple-processing iterations of the distributed data set. The system facilitates server layer modification of the decision tree when the modification requires multiple-processing iterations of the distributed data set. The system generates a representation of the application layer modified decision tree or the server layer modified decision tree.
SAS Institute | Date: 2015-10-28
A computing device to compute clusters using random subsets of variables is provided. Each data point of a plurality of data points is associated with a variable to define a plurality of variables. A subset of the plurality of variables is randomly selected. The subset does not include all of the plurality of variables. A number of clusters into which to segment the received data is determined. Cluster data that defines each cluster of the determined number of clusters is determined by executing a clustering algorithm with the received data using only the plurality of data points defined for each observation that are associated with the randomly selected subset of the plurality of variables. The determined cluster data is stored to cluster second data into the determined number of clusters. The second data is different from the received data.
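The flow described above might look like the following sketch. The abstract does not name a clustering algorithm, so a basic k-means is swapped in here as an assumption, and the number of clusters is passed in rather than determined automatically. All function names, the seed parameter, and the sample data are hypothetical.

```python
import math
import random


def kmeans(points, k, rng, iterations=20):
    """Plain k-means on tuples; an assumed stand-in for the clustering step."""
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids


def cluster_on_random_subset(rows, k, n_vars, seed=0):
    """Cluster using only a randomly chosen proper subset of the variables."""
    rng = random.Random(seed)
    total = len(rows[0])
    if not 0 < n_vars < total:
        raise ValueError("subset must be non-empty and omit at least one variable")
    subset = sorted(rng.sample(range(total), n_vars))
    projected = [tuple(row[v] for v in subset) for row in rows]
    return subset, kmeans(projected, k, rng)


def assign(rows, subset, centroids):
    """Use the stored cluster data to label second, previously unseen data."""
    return [
        min(range(len(centroids)),
            key=lambda c: math.dist(tuple(row[v] for v in subset), centroids[c]))
        for row in rows
    ]


received = [(0.0, 0.0, 0.0), (0.1, 0.2, 0.1), (10.0, 10.0, 10.0), (10.1, 9.9, 10.2)]
subset, centroids = cluster_on_random_subset(received, k=2, n_vars=2)
labels = assign([(0.05, 0.1, 0.0), (9.8, 10.1, 10.0)], subset, centroids)
```

The stored subset and centroids play the role of the "cluster data": they are all that is needed to segment the second data set.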