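I have two timestamps as input. I want to calculate the time difference between them, excluding Sundays.
I can get the number of days with the datediff function in Hive.
I can get the day of the week for a given date with from_unixtime(unix_timestamp(startdate),'EEEE').
But I don't know how to combine these functions to meet my requirement, or whether there is a simpler way to achieve this.
Thanks in advance.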
You can write a custom UDF that takes two date columns as input and calculates the difference between the dates, excluding Sundays.
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public class IsoYearWeek extends UDF {
    // Takes the two date columns (formatted dd/MM/yyyy) as inputs,
    // for example "20/07/2016" and "28/07/2016".
    public LongWritable evaluate(Text dateString, Text dateString1) throws ParseException {
        if (dateString == null || dateString1 == null) {
            return null; // a NULL input yields a NULL result
        }
        // Parse in UTC so every day is exactly 24 hours long (no DST shifts).
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat format = new SimpleDateFormat("dd/MM/yyyy");
        format.setTimeZone(utc);
        Date startDate = format.parse(dateString.toString());
        Date endDate = format.parse(dateString1.toString());
        long interval = 24L * 60 * 60 * 1000; // 1 day in millis
        long endTime = endDate.getTime();
        int count = 0;
        Calendar cal = Calendar.getInstance(utc);
        // Walk the range one day at a time, both endpoints included.
        for (long curTime = startDate.getTime(); curTime <= endTime; curTime += interval) {
            cal.setTimeInMillis(curTime);
            if (cal.get(Calendar.DAY_OF_WEEK) == Calendar.SUNDAY) {
                count += 1; // counts the number of Sundays in between
            }
        }
        // Day difference excluding Sundays.
        long daysDiff = (endTime - startDate.getTime()) / interval - count;
        return new LongWritable(daysDiff);
    }
}
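For comparison, the same counting rule can be sketched outside Hive with java.time (Java 8 or later), which avoids the millisecond arithmetic altogether. This is only a minimal sketch, not the UDF above: the class name SundayFreeDiff and the method daysExcludingSundays are hypothetical, and it assumes the same dd/MM/yyyy input format and the same convention of counting Sundays at both endpoints of the range.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper class, used here for illustration only.
class SundayFreeDiff {
    // Assumed input format, matching the UDF's dd/MM/yyyy pattern.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd/MM/yyyy");

    // Day difference between two dates, minus every Sunday
    // in the inclusive range [start, end].
    public static long daysExcludingSundays(String start, String end) {
        LocalDate startDate = LocalDate.parse(start, FORMAT);
        LocalDate endDate = LocalDate.parse(end, FORMAT);
        long sundays = 0;
        // Walk the range one calendar day at a time.
        for (LocalDate d = startDate; !d.isAfter(endDate); d = d.plusDays(1)) {
            if (d.getDayOfWeek() == DayOfWeek.SUNDAY) {
                sundays++;
            }
        }
        return ChronoUnit.DAYS.between(startDate, endDate) - sundays;
    }

    public static void main(String[] args) {
        System.out.println(daysExcludingSundays("20/07/2016", "28/07/2016")); // prints 7
    }
}
```

From 20/07/2016 to 28/07/2016 there are 8 days and one Sunday (24/07/2016), so the sketch prints 7.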