If we are talking about splitting data, one of the proven method will be Hash, very efficient(one pass to the data if presorted, no need to know how many ‘types’ in advance) and fully dynamic.
input type x y;
1 12 13
1 14 15
1 12 13
2 14 25
2 15 26
2 26 23
3 25 26
3 26 35
3 36 38
declare hash h();
h.definedata(‘type’, ‘x’, ‘y’);
do _n_=1 by 1 until (last.type);
/* Create sample data */ data birth; input name $ bday :mmddyy10.; datalines; Miguel 12/31/1973 Joe 02/28/1976 Rutger 03/29/1976 Broguen 03/01/1976 Susan 12/12/1976 Michael 02/14/1971 LeCe 11/09/1967 Hans 07/02/1955 Lou 07/30/1960 ; /* Use the INTCK function to count the number of months between */ /* the date of birth and the current date. Divide the number of */ /* months by 12 to produce the number of years. Use the MONTH */ /* function to determine if the month of the birthday and the current */ /* date are the same. If they are, determine if the birthday has */ /* occurred this year. If it hasn't, adjust the age by subtracting */ /* one year. */ data ages; set birth; retain current; if _n_=1 then current=today(); format bday current worddate20.; age=int(intck('month',bday,current)/12); if month(bday)=month(current) then age=age-(day(bday)>day(current)); run; proc print; run;
SAS has the LAG function to look backwards as LinusH pointed out. However there is no looking forward function. But you can kind of simulate this using a second SET statement with the POINT= option so that you will read the I+1 observation, you will need to rename the variables from the second SET statement, so that you can compare. Taking your logic here is some sample code.
infile cards dlm=”,”;
Patient_ID : 8.
Admit_Date : anydtdtm.
Disch_Date : anydtdtm.
Facility_Type : $32.
format admit_date disch_date datetime19.;
obs1 = 1;
do while( obs1 <= nobs);
set have nobs=nobs;
obs2 = obs1 + 1;
Patient_ID = pId2
Admit_Date = ad2
Disch_Date = dd2
Facility_Type = ft2
If pId2 = Patient_ID and
ft2 = “Hospital” and
Facility_Type = “Nursing Home” and
datepart(ad2) >= datepart(Admit_Date) and
datepart(ad2) <= datepart(Disch_Date) Then
Flag = 1;
Else Flag = 0;
obs1 + 1;
drop obs1 obs2 pId2 ad2 dd2 ft2;
returns the SAS date value given the Julian date in yyddd or yyyyddd format. For example, DATE = DATEJUL(99001); assigns the SAS date value ’01JAN99’D to DATE, and DATE = DATEJUL(1999365); assigns the SAS date value ’31DEC1999’D to DATE.
returns a SAS date value for the holiday and year specified. Valid values for holiday are ‘BOXING’, ‘CANADA’, ‘CANADAOBSERVED’, ‘CHRISTMAS’, ‘COLUMBUS’, ‘EASTER’, ‘FATHERS’, ‘HALLOWEEN’, ‘LABOR’, ‘MLK’, ‘MEMORIAL’, ‘MOTHERS’, ‘NEWYEAR’,’THANKSGIVING’, ‘THANKSGIVINGCANADA’, ‘USINDEPENDENCE’, ‘USPRESIDENTS’, ‘VALENTINES’, ‘VETERANS’, ‘VETERANSUSG’, ‘VETERANSUSPS’, and ‘VICTORIA’. For example: EASTER2000 = HOLIDAY(’EASTER’, 2000);
returns the index of the seasonal cycle given an interval and an appropriate SAS date, datetime, or time value. For example, the seasonal cycle for INTERVAL=’DAY’ is ‘WEEK’, soINTCINDEX(’DAY’,’01SEP78’D); returns 35 since September 1, 1978, is the sixth day of the th week of the year. For correct results, date intervals should be used with date values, and datetime intervals should be used with datetime values.
returns the interval of the seasonal cycle, given a date, time, or datetime interval. For example, INTCYCLE(‘MONTH’) returns ‘YEAR’ since the months January, February, …, December constitute a yearly cycle. INTCYCLE(‘DAY’) returns ‘WEEK’ since Sunday, Monday, …, Saturday is a weekly cycle.
returns an interval that fits exactly between two SAS date, datetime, or observation values, in the sense of the INTNX function uses SAMEDAY alignment. In the following example, result1 is the same as date1 and result2 is the same as date2.
FitInterval = INTFIT( date1, date2, 'D' ); result1 = INTNX( FitInterval, date1, 0, 'SAMEDAY'); result2 = INTNX( FitInterval, date1, 1, 'SAMEDAY');
More than one interval can fit the preceding definition. For instance, two SAS date values that are seven days apart could be fit with either ‘DAY7’ or ‘WEEK’. The INTFIT algorithm chooses the more common interval, so ‘WEEK’ is the result when the dates are seven days apart. The INTFIT function can be used to detect the possible frequency of the time series or to analyze frequencies of other events in a time series, such as outliers or missing values.
returns a recommended format, given a date, time, or datetime interval for displaying the time ID values associated with a time series of the given interval. The valid values of size (‘long,’ ‘l,’ ‘short,’ ‘s’) specify whether the returned format uses a two-digit or four-digit year to display the SAS date value.
returns an interval that fits three consecutive SAS date or datetime values. The INTGET function examines two intervals: the first interval between date1 and date2, and the second interval between date2 and date3. In order for an interval to be detected, either the two intervals must be the same or one interval must be an integer multiple of the other interval. That is, INTGET assumes that at least two of the dates are consecutive points in the time series, and that the other two dates are also consecutive or represent the points before and after missing observations. The INTGET algorithm assumes that large values are SAS datetime values, which are measured in seconds, and that smaller values are SAS date values, which are measured in days. The INTGET function can be used to detect the possible frequency of the time series or to analyze frequencies of other events in a time series, such as outliers or missing values.
returns the seasonal index, given a date, time, or datetime interval and an appropriate date, time, or datetime value. The seasonal index is a number that represents the position of the date, time, or datetime value in the seasonal cycle of the specified interval. For example, INTINDEX(’MONTH’,’01DEC2000’D); returns 12 because monthly data is yearly periodic and DECEMBER is the th month of the year. However, INTINDEX(’DAY’,’01DEC2000’D); returns 6 because daily data is weekly periodic and December 01, 2000, is a Friday, the sixth day of the week. To correctly identify the seasonal index, the interval specification should agree with the date, time, or datetime value. For example, INTINDEX(’DTMONTH’,’01DEC2000’D); andINTINDEX(’MONTH’,’01DEC2000:00:00:00’DT); do not return the expected value of 12. However, both INTINDEX(’MONTH’,’01DEC2000’D); andINTINDEX(’DTMONTH’,’01DEC2000:00:00:00’DT); return the expected value of 12.
returns the date or datetime value of the beginning of the interval that is n intervals from the interval that contains the given date or datetime value. The optional alignment argument specifies that the returned date is aligned to the beginning, middle, or end of the interval. Beginning is the default. In addition, you can specify SAME (or S) alignment. The SAME alignment bases the alignment of the calculated date or datetime value on the alignment of the input date or datetime value. As illustrated in the following example, the SAME alignment can be used to calculate the meaning of “same day next year” or “same day 2 weeks from now.”
nextYear = INTNX( 'YEAR', '15Apr2007'D, 1, 'S' ); TwoWeeks = INTNX( 'WEEK', '15Apr2007'D, 2, 'S' );
The preceding example returns ’15Apr2008’D for nextYear and ’29Apr2007’D for TwoWeeks. For all values of alignment, the number of intervals n between the input date and the resulting date agrees with the input value.
date2 = INTNX( interval, date1, n1, align ); n2 = INTCK( interval, date1, date2 );
The result is always that n2 = n1.
returns the length of the seasonal cycle, given a date, time, or datetime interval. The length of a seasonal cycle is the number of intervals in a seasonal cycle. For example, when the interval for a time series is described as monthly, many procedures use the option INTERVAL=MONTH to indicate that each observation in the data then corresponds to a particular month. Monthly data are considered to be periodic for a one-year seasonal cycle. There are 12 months in one year, so the number of intervals (months) in a seasonal cycle (year) is 12. For quarterly data, there are 4 quarters in one year, so the number of intervals in a seasonal cycle is 4. The periodicity is not always one year. For example, INTERVAL=DAY is considered to have a seasonal cycle of one week, and because there are 7 days in a week, the number of intervals in a seasonal cycle is 7.
returns 1 if the interval name is a valid interval, 0 otherwise. For example, VALID = INTTEST(’MONTH’); should set VALID to 1, while VALID = INTTEST(’NOTANINTERVAL’); should set VALID to 0. The INTTEST function can be useful in verifying which values of multiplier n and the shift index s are valid in constructing an interval name.
returns the Julian date from a SAS date value. The format of the Julian date is either yyddd or yyyyddd depending on the value of the system option YEARCUTOFF=. For example, using the default system option values, JULDATE( ’31DEC1999’D ); returns 99365, while JULDATE( ’31DEC1899’D ); returns 1899365.
returns a SAS date value for the th weekday of the month and year specified. For example, Thanksgiving is always the fourth () Thursday () in November (). ThusTHANKS2000 = NWKDOM( 4, 5, 11, 2000); returns the SAS date value for Thanksgiving in the year 2000. The last weekday of a month can be specified using . Memorial Day in the United States is the last () Monday () in May (), and so MEMORIAL2002 = NWKDOM( 5, 2, 5, 2002); returns the SAS date value for Memorial Day in 2002. Because always specifies the last occurrence of the month and most months have only 4 instances of each day, the result for is often the same as the result for . NWKDOM is useful for calculating the SAS date values of holidays that are defined in this manner.
If the descriptor is ‘U,’ weeks start on Sunday and the range is to . If weeks and exist, they are only partial weeks. Week 52 can be a partial week.
If the descriptor is ‘V’, the result is equivalent to the ISO 8601 week of year definition. The range is to . Week is a leap week. The first week of the year, Week , and the last week of the year, Week or , can include days in another Gregorian calendar year.
If the descriptor is ‘W’, weeks start on Monday and the range is to . If weeks and exist, they are only partial weeks. Week 52 can be a partial week.
data _null_; do year=2006 to 2009; fdoy=mdy(1,1,year); /* For years prior to 2007, daylight time begins in the United States on */ /* the first Sunday in April and ends on the last Sunday in October. */ if year <= 2006 then do; fdo_apr=intnx('month',fdoy,3); dst_beg=intnx('week.1',fdo_apr,(weekday(fdo_apr) ne 1)); fdo_oct=intnx('month',fdoy,9); dst_end=intnx('week.1',fdo_oct,(weekday(fdo_oct) in (6,7))+4); end; /* Due to the Energy Policy Act of 2005, Pub. L. no. 109-58, 119 Stat 594 */ /* (2005). Starting in March 2007, daylight time in the United States */ /* will begin on the second Sunday in March and end on the first Sunday */ /* in November. For more information, one reference is */ /* http://aa.usno.navy.mil */ else do; fdo_mar=intnx('month',fdoy,2); dst_beg=intnx('week.1',fdo_mar, (weekday(fdo_mar) in (2,3,4,5,6,7))+1); fdo_nov=intnx('month',fdoy,10); dst_end=intnx('week.1',fdo_nov,(weekday(fdo_nov) ne 1)); end; put dst_beg= worddate. / dst_end= worddate. / ; end; run;