SAS Institute is a developer of analytics software based in Cary, North Carolina. SAS develops and markets a suite of analytics software that helps users manage, access, analyze and report on data to aid in decision-making. The company is the world's largest privately held software business, and its software is used by most of the Fortune 500.
SAS has developed a model workplace environment and benefits program designed to retain employees, allow them to focus on their work, and reduce operating costs. It provides on-site, subsidized or free healthcare, gyms, daycare and life counseling services. Professor Jeffrey Pfeffer from the Stanford Graduate School of Business estimated that the company saves $60–$80 million annually in expenses related to employee turnover.
SAS Institute started in the late 1960s as a project at North Carolina State University to create "statistical analysis software" that was originally used primarily by agricultural departments at universities. In 1976 it became an independent, private business led by current CEO James Goodnight and three other project leaders from the university. SAS grew from $10 million in revenues in 1980 to $1.1 billion by 2000. A larger proportion of these revenues is spent on research and development than at most other software companies, at one point more than double the industry average. Wikipedia.
SAS Institute | Date: 2015-08-07
In a computing device supporting distributed stream processing, a request is received from a controller device to redistribute blocks storing streamed data. The request indicates that a number of blocks stored on the computing device be sent to a second computing device. The controller device controls distribution of analytic results to a data access system. The analytic results are computed from the streamed data. The indicated number of blocks are selected from the blocks storing the streamed data. The selected blocks are sent to the second computing device. Pointers to remaining blocks of the blocks storing the streamed data are updated.
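The redistribution step in this abstract (select a number of blocks, send them to a second device, update pointers to what remains) can be illustrated with a small in-memory sketch. Everything here is hypothetical: the `Node` class, the `store` and `redistribute` names, and the choice to move the oldest blocks first are assumptions for illustration, and "sending" is simulated by a direct method call rather than a network transfer.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a computing device holding streamed blocks."""
    name: str
    blocks: list = field(default_factory=list)    # (block_id, payload) in arrival order
    pointers: dict = field(default_factory=dict)  # block_id -> position in `blocks`

    def store(self, block_id, payload):
        """Append a block and record a pointer to its position."""
        self.pointers[block_id] = len(self.blocks)
        self.blocks.append((block_id, payload))

    def redistribute(self, count, target):
        """Move `count` blocks to `target` and re-point the remaining blocks.

        Assumption: the oldest blocks are selected; the abstract does not
        say which blocks are chosen.
        """
        count = min(count, len(self.blocks))
        selected, remaining = self.blocks[:count], self.blocks[count:]
        for block_id, payload in selected:
            target.store(block_id, payload)  # simulated send
        self.blocks = remaining
        # Rebuild pointers so they refer to positions in the shortened list.
        self.pointers = {bid: pos for pos, (bid, _) in enumerate(remaining)}
        return [bid for bid, _ in selected]


source, peer = Node("source"), Node("peer")
for i in range(5):
    source.store(f"blk{i}", [i])
moved = source.redistribute(2, peer)
```

In this sketch the controller's request is reduced to the `count` and `target` arguments; a real system would also need to handle acknowledgement and failure of the transfer, which the abstract does not describe.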
SAS Institute | Date: 2015-04-24
A computer system is provided in which a geometric plot is generated having at least two axes, wherein a dataset from which the plot will be generated specifies at least one shape for the geometric plot and wherein the plot includes at least one axis having a plurality of discrete, categorical index values. Zero or more offset values are specified that determine a mapping of one or more shape-defining vertices of the at least one shape to a location that is a fractional distance between two of the discrete, categorical index values, such that a generated set of data specifies a pixel location for each of the shape-defining vertices of the at least one shape.
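One way to picture the offset mapping is a function that turns a (category, offset, value) vertex into a pixel location. The sketch below assumes evenly spaced category bands on the horizontal axis, an offset expressed as a fraction of the spacing between adjacent categories, and a linear value axis with pixel y growing downward. The names `PlotArea` and `vertex_to_pixel` are hypothetical and are not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class PlotArea:
    """Hypothetical pixel rectangle that the plot is drawn into."""
    left: float
    top: float
    width: float
    height: float


def vertex_to_pixel(category, offset, value, categories, value_range, area):
    """Map one shape-defining vertex to an (x, y) pixel location.

    Assumption: each category owns an equal-width band and sits at the band
    center; `offset` shifts the vertex by that fraction of the band width, so
    0.5 lands halfway between a category and its right-hand neighbor.
    """
    band = area.width / len(categories)
    index = categories.index(category)
    x = area.left + (index + 0.5 + offset) * band
    vmin, vmax = value_range
    # Pixel y grows downward, so larger values map closer to the top edge.
    y = area.top + (1 - (value - vmin) / (vmax - vmin)) * area.height
    return (x, y)


categories = ["A", "B", "C"]
area = PlotArea(left=0, top=0, width=300, height=200)
triangle = [("A", 0.5, 0), ("B", -0.25, 10), ("B", 0.0, 5)]
pixels = [vertex_to_pixel(c, o, v, categories, (0, 10), area) for c, o, v in triangle]
```

With three categories across 300 pixels, the category centers fall at x = 50, 150 and 250, so the first vertex (category "A", offset 0.5) lands at x = 100, midway between "A" and "B".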
SAS Institute | Date: 2015-10-28
A computing device presents a cluster visualization based on a neural network computation. First centroid locations are computed for first clusters. Second centroid locations are computed for second clusters. Each centroid location includes a plurality of coordinate values where each coordinate value relates to a single variable of a plurality of variables. Distances are computed pairwise between each centroid location. An optimum pairing is selected based on a minimum distance of the computed pairwise distances where each pair is associated with a different cluster of a set of composite clusters. Noised centroid location data is created. A multi-layer neural network is trained with the noised centroid location data. A projected centroid location is determined in a multidimensional space for each centroid location as values of hidden units of a middle layer of the multi-layer neural network. A graph is presented for display that indicates the determined, projected centroid locations.
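The pairing and noising steps can be sketched as follows. The abstract does not say how the "optimum pairing" is found, so this example assumes an exhaustive search over one-to-one pairings that minimizes the total Euclidean distance, which is only practical for a handful of clusters. The Gaussian noise model and the names `pair_centroids` and `add_noise` are likewise assumptions, and the neural network training and projection steps are omitted.

```python
import math
import random
from itertools import permutations


def pair_centroids(first, second):
    """Pair each first-set centroid with a distinct second-set centroid.

    Assumption: the optimum pairing is the one-to-one assignment with the
    smallest total Euclidean distance, found by brute force.
    """
    if len(first) != len(second):
        raise ValueError("both centroid sets must have the same size")
    dist = [[math.dist(a, b) for b in second] for a in first]
    best = min(
        permutations(range(len(second))),
        key=lambda perm: sum(dist[i][j] for i, j in enumerate(perm)),
    )
    return list(enumerate(best))  # [(first_index, second_index), ...]


def add_noise(centroids, sigma, copies, seed=0):
    """Create `copies` noised versions of every centroid (assumed Gaussian noise)."""
    rng = random.Random(seed)
    return [
        tuple(coord + rng.gauss(0, sigma) for coord in centroid)
        for centroid in centroids
        for _ in range(copies)
    ]


first = [(0.0, 0.0), (10.0, 10.0)]
second = [(9.0, 9.0), (1.0, 1.0)]
pairs = pair_centroids(first, second)
noised = add_noise(first + second, sigma=0.1, copies=3)
```

For more than a few clusters, an assignment solver would replace the brute-force search, since the number of permutations grows factorially.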
SAS Institute | Date: 2015-10-30
A computing device to select decorrelated variables using a graph-based method is provided. A correlation value is computed between each pair of a plurality of variables to define a correlation matrix. A binary threshold value is compared to each correlation value to define a binary similarity matrix from the correlation matrix. An undirected graph comprising a subgraph that includes one or more connected nodes is defined based on the binary similarity matrix to store connectivity information for the plurality of variables. Each node of the subgraph is pairwise associated with a unique variable of the variables. (a) A least connected node is selected from the undirected graph based on the connectivity information. (b) The selected least connected node is removed from the undirected graph. (c) The connectivity information for the undirected graph is updated based on the removed node. (d) Steps (a)-(c) are repeated until a stop criterion is satisfied.
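Steps (a) through (d) can be sketched in a few lines, with several assumptions that go beyond the abstract: Pearson correlation is used, the threshold is compared against the absolute correlation, the selected node is kept as a decorrelated variable while its neighbors are removed with it, ties are broken by name, and the stop criterion is an empty graph. The function names are hypothetical, and the input columns are assumed to be non-constant.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def select_decorrelated(columns, threshold):
    """Return variable names chosen by repeatedly taking the least connected node."""
    names = sorted(columns)
    # Binary similarity matrix stored as adjacency sets of an undirected graph.
    neighbors = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if abs(pearson(columns[a], columns[b])) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    selected = []
    while neighbors:  # assumed stop criterion: no nodes left
        # (a) select the least connected node, breaking ties by name
        node = min(neighbors, key=lambda n: (len(neighbors[n]), n))
        selected.append(node)
        # (b) remove it, together with the variables it is correlated with
        dropped = neighbors[node] | {node}
        for name in dropped:
            del neighbors[name]
        # (c) update the connectivity information of the remaining nodes
        for adjacent in neighbors.values():
            adjacent -= dropped
    return selected


columns = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],    # perfectly correlated with "a"
    "c": [1, -1, 1, -1],  # weakly correlated with both
}
chosen = select_decorrelated(columns, threshold=0.9)
```

Here "a" and "b" are linked by an edge, "c" is isolated, and the sketch keeps "c" and "a" while dropping "b".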
SAS Institute | Date: 2015-07-02
Techniques for providing interactive decision trees are included. For example, a system is provided that stores data related to a decision tree, wherein the data includes one or more data structures and one or more portions of code. The system receives input corresponding to an interaction request associated with a modification to the decision tree. The system determines whether the modification requires multiple-processing iterations of the distributed data set. The system generates an application layer modified decision tree when the generating requires no multiple-processing iterations of the distributed data set. The system facilitates server layer modification of the decision tree when the modification requires multiple-processing iterations of the distributed data set. The system generates a representation of the application layer modified decision tree or the server layer modified decision tree.
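The routing decision described here, application layer when no repeated passes over the distributed data set are needed and server layer otherwise, can be sketched as a small dispatcher. Which modification kinds need multiple data passes is not stated in the abstract; the set below, along with the `Modification` and `route_modification` names, is purely an illustrative assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modification:
    """Hypothetical description of a user's requested change to the tree."""
    kind: str
    node_id: int


# Assumption for illustration only: these kinds need repeated passes over the data.
NEEDS_DATA_PASSES = {"split", "retrain_subtree"}


def route_modification(mod):
    """Return which layer should apply the modification."""
    if mod.kind in NEEDS_DATA_PASSES:
        return "server"       # server layer re-processes the distributed data set
    return "application"      # application layer edits the stored tree directly


prune_route = route_modification(Modification("prune", node_id=3))
split_route = route_modification(Modification("split", node_id=3))
```

The intuition behind this assumed split is that pruning only discards results that were already computed, whereas a new split needs statistics gathered from the data.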