SAS Institute is a developer of analytics software based in Cary, North Carolina. SAS develops and markets a suite of analytics software, which helps manage, access, analyze and report on data to aid in decision-making. The company is the world's largest privately held software business and its software is used by most of the Fortune 500. SAS has developed a model workplace environment and benefits program designed to retain employees, allow them to focus on their work, and reduce operating costs. Professor Jeffrey Pfeffer from the Stanford Graduate School of Business estimated that the company saves $60–$80 million annually in expenses related to employee turnover. It provides on-site, subsidized or free healthcare, gyms, daycare and life counseling services. SAS Institute started as a project at North Carolina State University to create "statistical analysis software" that was originally used primarily by agricultural departments at universities in the late 1960s. It became an independent, private business led by current CEO James Goodnight and three other project leaders from the university in 1976. SAS grew from $10 million in revenues in 1980 to $1.1 billion by 2000. A larger proportion of these revenues is spent on research and development than at most other software companies, at one point more than double the industry average. Wikipedia.
SAS Institute | Date: 2017-01-18
In a computing device supporting a failover in an event stream processing (ESP) system, an event block object is received. A first status of the computing device as active or standby is determined. When the first status is active, a second status of the computing device as newly active or not newly active is determined. Newly active is determined when the computing device is switched from a standby to an active status. When the second status is newly active, a last published event block object identifier that uniquely identifies a last published event block object is determined. A next event block object is selected from a non-transitory computer-readable medium accessible by the computing device. The next event block object has an event block object identifier that is greater than the determined last published event block object identifier. The selected next event block object is published to an out-messaging network device.
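The failover logic in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `FailoverPublisher`, the `promote` method, the in-memory `buffer`, and the `last_published_id` callback are all hypothetical names introduced here, and the sketch assumes event block identifiers increase monotonically.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EventBlock:
    block_id: int   # assumed to increase monotonically
    payload: str


@dataclass
class FailoverPublisher:
    # publish: sends a block to the out-messaging network (hypothetical callback)
    publish: Callable[[EventBlock], None]
    # last_published_id: asks the out-messaging network for the identifier of
    # the last block it received (hypothetical callback)
    last_published_id: Callable[[], int]
    active: bool = False
    newly_active: bool = False
    # Stand-in for the computer-readable medium holding received blocks.
    # A real system would bound or trim this buffer.
    buffer: List[EventBlock] = field(default_factory=list)

    def promote(self) -> None:
        """Switch this device from standby to active."""
        self.active = True
        self.newly_active = True

    def on_event_block(self, block: EventBlock) -> None:
        self.buffer.append(block)
        if not self.active:
            return  # standby: keep the block, publish nothing
        if self.newly_active:
            # Replay only the blocks after the last one already published.
            last_id = self.last_published_id()
            for pending in sorted(self.buffer, key=lambda b: b.block_id):
                if pending.block_id > last_id:
                    self.publish(pending)
            self.newly_active = False
        else:
            self.publish(block)
        self.buffer.clear()
```

In this sketch a standby device only buffers; on its first block after promotion it replays every buffered block whose identifier exceeds the last published one, then continues publishing normally.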
SAS Institute | Date: 2016-09-06
Exemplary embodiments are generally directed to methods, mediums, and systems for correcting censored or constrained historical data with various possible types of computing devices, including cloud-based devices, personal computing devices, and edge-based devices. The corrected data may be used in forecasting, for example to forecast demand for a limited resource. In some embodiments, the data is modeled at a higher level of granularity than an individual record. The aggregated demand may then be pro-rated over a group of categories or users where a given category of users that might be small or nonexistent over a certain time frame may be better accommodated. Moreover, it may be easier or more efficient to make assumptions and employ computing resources at the aggregate level.
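The pro-rating step mentioned in this abstract might look like the following Python sketch. The abstract does not state how the shares are computed; the function `prorate_demand` and its additive `smoothing` parameter are assumptions made here so that a category with little or no observed history can still receive a share.

```python
from typing import Dict


def prorate_demand(
    aggregate_demand: float,
    historical_counts: Dict[str, float],
    smoothing: float = 0.0,
) -> Dict[str, float]:
    """Split an aggregate demand forecast across categories.

    Each category's share is proportional to its historical count plus an
    optional additive smoothing term (an assumption of this sketch).
    """
    weights = {k: v + smoothing for k, v in historical_counts.items()}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("no positive weight to pro-rate over")
    return {k: aggregate_demand * w / total for k, w in weights.items()}
```

With `smoothing=0` the split is purely proportional to history; a positive value shifts some demand toward categories that were small or absent in the observed period.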
SAS Institute | Date: 2016-08-10
Information related to a time series can be predicted. For example, a repetitive characteristic of the time series can be determined by analyzing the time series for a pattern that repeats over a predetermined time period. An adjusted time series can be generated by removing the repetitive characteristic from the time series. An effect of a moving event on the adjusted time series can be determined. The moving event can occur on different dates for two or more consecutive years. A residual time series can be generated by removing the effect of the moving event from the adjusted time series. A base forecast that is independent of the repetitive characteristic and the effect of the moving event can be generated using the residual time series. A predictive forecast can be generated by including the repetitive characteristic and the effect of the moving event into the base forecast.
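The decomposition described in this abstract can be sketched in Python as follows. The abstract names no estimators, so this sketch makes simple assumptions: the repetitive characteristic is a per-phase mean offset, the moving-event effect is a mean difference between event and non-event observations, and the base forecast is the mean of the residual series. The function names and the 0/1 `event_flags` encoding are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def seasonal_profile(series: Sequence[float], period: int) -> List[float]:
    """Mean offset of each phase of the period from the overall mean."""
    overall = mean(series)
    return [mean(series[i::period]) - overall for i in range(period)]


def forecast(
    series: Sequence[float],
    period: int,
    event_flags: Sequence[int],         # 1 where the moving event occurred
    future_event_flags: Sequence[int],  # 1 where it will occur in the horizon
) -> List[float]:
    # 1. Remove the repetitive characteristic to get the adjusted series.
    season = seasonal_profile(series, period)
    adjusted = [y - season[t % period] for t, y in enumerate(series)]

    # 2. Estimate the moving-event effect on the adjusted series.
    on = [a for a, f in zip(adjusted, event_flags) if f]
    off = [a for a, f in zip(adjusted, event_flags) if not f]
    effect = mean(on) - mean(off) if on and off else 0.0

    # 3. Remove the event effect to get the residual series.
    residual = [a - effect * f for a, f in zip(adjusted, event_flags)]

    # 4. Base forecast from the residual series (here: its mean).
    base = mean(residual)

    # 5. Add the repetitive characteristic and event effect back in.
    n = len(series)
    return [
        base + season[(n + h) % period] + effect * f
        for h, f in enumerate(future_event_flags)
    ]
```

A moving event here is something like a holiday whose date shifts from year to year, which is why it is flagged per observation rather than folded into the fixed-period seasonal profile.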
An apparatus comprising a processor component to: provide, to a control device, an indication of availability to perform a processing task with one or more data set portions as a node device; perform a processing task specified by the control device with the one or more data set portions; and request a pointer to a location at which to store the one or more data set portions as a data block within a data file. In response to the data set including partitioned data, for each data set portion, include a data sub-block size of the data set portion and a hashed identifier derived from a partition label of a partition in the request; receive, from the control device, the requested pointer to the location; and store each data set portion as a data sub-block within the data block starting at the location within the data file.
An apparatus comprising a processor component to: receive metadata of data organization within a data set; receive indications of which node devices will be storing the data set as multiple data blocks within a data file; and receive, from each node device, a pointer request to a location within the data file for storing a data set portion as a data block. In response to the data set including partitioned data, for each request for a pointer: determine the location within the data file; generate a map data map entry for the data block; generate therein a sub-block count of data sub-blocks within the data block; generate therein a sub-entry for each data sub-block including size and a hashed identifier derived from a partition label; and provide a pointer to the node device. In response to successful storage of all data blocks, store the map data in the data file.
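The pointer-request handling in the two preceding abstracts can be sketched in Python as follows. This is an illustration under stated assumptions: `ControlDevice`, `MapEntry`, and `request_pointer` are hypothetical names, blocks are assumed to be laid out contiguously in the data file, and `zlib.crc32` stands in for whatever hash the patent derives from a partition label.

```python
import zlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MapEntry:
    offset: int                        # location of the data block in the file
    sub_blocks: List[Tuple[int, int]]  # (size, hashed identifier) per sub-block

    @property
    def sub_block_count(self) -> int:
        return len(self.sub_blocks)


class ControlDevice:
    """Hands out file locations to node devices and records map data."""

    def __init__(self) -> None:
        self.next_offset = 0
        self.map_data: List[MapEntry] = []

    def request_pointer(self, portions: List[Tuple[str, int]]) -> int:
        """portions: (partition label, sub-block size) for each data set portion.

        Returns the offset at which the node should store its data block.
        """
        offset = self.next_offset
        subs = [
            (size, zlib.crc32(label.encode("utf-8")))  # assumed hash function
            for label, size in portions
        ]
        self.map_data.append(MapEntry(offset, subs))
        self.next_offset += sum(size for _, size in portions)
        return offset
```

Once every node reports a successful write, the accumulated `map_data` would itself be written into the data file, as the abstract describes.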
SAS Institute | Date: 2016-11-15
A computing device sorts a plurality of data points in a first dimension. A first data point has a first value, a second data point has a second value, and a third data point has a third value defined in a second dimension. (a) The second value is compared to the first and third values. (b) When the second value is less than the first value and greater than the third value, or the second value is greater than the first value and less than the third value, the second data point is deleted. (c) The first data point is defined as the second data point. (d) The second data point is defined as the third data point. (e) The third data point is defined as a next data point. (a)-(e) are repeated until each of the plurality of data points is defined as the third data point to define a plurality of sampled data points as remaining data points of the plurality of data points.
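Steps (a) through (e) can be sketched in Python as follows. The function name `sample_extrema` is hypothetical, and the sketch reads steps (c)-(e) as a window sliding over the original sorted points, so each point is compared with its original neighbors even if one of them was deleted; that reading is an interpretation of the abstract.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (first dimension, second dimension)


def sample_extrema(points: Sequence[Point]) -> List[Point]:
    """Drop points whose second-dimension value lies strictly between
    the values of their two neighbors after sorting by the first dimension."""
    pts = sorted(points, key=lambda p: p[0])
    if len(pts) < 3:
        return pts

    keep = [True] * len(pts)
    # Slide a three-point window (first, second, third) across the data.
    for i in range(1, len(pts) - 1):
        first, second, third = pts[i - 1][1], pts[i][1], pts[i + 1][1]
        if first > second > third or first < second < third:
            keep[i] = False  # step (b): delete the middle point
    return [p for p, k in zip(pts, keep) if k]
```

The effect is to retain the endpoints, local peaks and valleys, and plateau points, while discarding points on monotone runs.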
SAS Institute | Date: 2016-10-25
In a system automatically processing data from a first computing device for use on a second computing device, a registry file including a plurality of filename parameters is read. Each filename parameter identifies a matching filename pattern, an extract script indicator, and a read file indicator. The extract script indicator indicates an extract script for a file having a filename that matches the matching filename pattern. The read file indicator indicates how to read the file having the filename that matches the matching filename pattern. One parameter of the plurality of filename parameters is selected by matching a filename of a source file to the matching filename pattern of the one parameter. The associated extract script is selected and used to read data from the source file using the associated read file indicator and the read data is output to a different file and in a different format.
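The registry lookup described in this abstract can be sketched in Python as follows. The registry layout, the key names (`pattern`, `extract_script`, `read_mode`), the two extract scripts, and the choice of JSON as the output format are all assumptions made for illustration; the abstract does not specify them. File contents are passed as strings to keep the sketch self-contained.

```python
import csv
import fnmatch
import io
import json

# Hypothetical registry: each entry pairs a filename pattern with an
# extract script indicator and a read file indicator.
REGISTRY = [
    {"pattern": "sales_*.csv", "extract_script": "rows_to_records", "read_mode": "csv"},
    {"pattern": "*.log", "extract_script": "lines_to_records", "read_mode": "lines"},
]

# Hypothetical extract scripts, keyed by the indicator in the registry.
EXTRACT_SCRIPTS = {
    "rows_to_records": lambda rows: rows,
    "lines_to_records": lambda lines: [{"line": ln} for ln in lines],
}


def read_source(text: str, read_mode: str):
    """Read the source content as directed by the read file indicator."""
    if read_mode == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if read_mode == "lines":
        return text.splitlines()
    raise ValueError(f"unknown read mode: {read_mode}")


def select_entry(filename: str, registry=REGISTRY):
    """Return the first registry entry whose pattern matches the filename."""
    for entry in registry:
        if fnmatch.fnmatchcase(filename, entry["pattern"]):
            return entry
    return None


def convert(filename: str, text: str, registry=REGISTRY) -> str:
    """Read the source with the selected entry and emit it as JSON."""
    entry = select_entry(filename, registry)
    if entry is None:
        raise LookupError(f"no registry entry matches {filename}")
    data = read_source(text, entry["read_mode"])
    records = EXTRACT_SCRIPTS[entry["extract_script"]](data)
    return json.dumps(records)
```

For example, `convert("sales_2016.csv", "id,qty\n1,5\n")` selects the first entry, reads the content as CSV, and returns the rows as a JSON array.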
An apparatus includes a processor component caused to: retrieve metadata of organization of data within a data set, and map data of organization of data blocks within a data file; receive indications of which node devices are available to perform a processing task with a data set portion; and in response to the data set including partitioned data, compare the quantities of available node devices and of the node devices last involved in storing the data set. In response to a match, for each map data map entry: retrieve a hashed identifier for a data sub-block, and a size for each of the data sub-blocks within the corresponding data block; divide the hashed identifier by the quantity of available node devices; compare the modulo value to a designation assigned to each of the available node devices; and provide a pointer to the available node device assigned the matching designation.
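The modulo-based assignment in this abstract can be sketched in Python as follows. The function `assign_sub_blocks` is a hypothetical name, and the sketch assumes that node designations are the integers 0 to n-1 and that sub-blocks are stored contiguously within each data block.

```python
from typing import Dict, List, Tuple

# One map entry: (block offset, [(hashed identifier, sub-block size), ...])
MapEntry = Tuple[int, List[Tuple[int, int]]]


def assign_sub_blocks(
    map_entries: List[MapEntry], node_count: int
) -> Dict[int, List[Tuple[int, int]]]:
    """Map each node designation to the (offset, size) pointers it receives."""
    assignments: Dict[int, List[Tuple[int, int]]] = {
        d: [] for d in range(node_count)
    }
    for block_offset, sub_blocks in map_entries:
        offset = block_offset
        for hashed_id, size in sub_blocks:
            # The remainder of hashed_id / node_count picks the node.
            designation = hashed_id % node_count
            assignments[designation].append((offset, size))
            offset += size  # the next sub-block follows immediately
    return assignments
```

Because the hashed identifier is derived from the partition label, every sub-block of a given partition lands on the same node as long as the node count is unchanged.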
SAS Institute | Date: 2016-11-02
Various embodiments are generally directed to an apparatus, method and other techniques for receiving a request to generate a bootable image in a cloud-based computing environment, creating a block storage volume in the cloud-based computing environment in response to receiving the request, the block storage volume having one or more partitions. Further, an apparatus, method and so forth may include installing software comprising one or more files in a file system on the block storage volume in the cloud-based computing environment, creating a snapshot of the file system including the software in the cloud-based computing environment, and creating a bootable image from the snapshot of the file system in the cloud-based computing environment.
SAS Institute | Date: 2016-09-23
Embodiments are generally directed to a method, medium, and system including processing circuitry to generate records including randomly selected events for each of one or more subjects having one or more of the same category parameters as a subject of a particular event. The processing circuitry may also present, on a display device, a computer-generated model based on the records, the model having a decision tree data structure having decision tree nodes corresponding with historical events from the records, each of the decision tree nodes having an indication of a likelihood of occurrence for the particular event based on whether a corresponding historical event of the decision tree node occurred or did not occur within a specific time period. Embodiments of the real-time distributed nature of the systems and processing discussed herein can solve big data analytics processing problems and facilitate data anomaly detection.