I get the following error when I execute a Pig script using the "Hadoop Pig Task" component in SSIS 2016:
<tt>[Hadoop Pig Task] Error: {"error":"File /opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/lib/hive-hcatalog/sbin/../share/webhcat/svr/lib/zookeeper-3.4.3.jar does not exist."}</tt>
Is this error caused by a problem in our Cloudera Hadoop configuration, or do the SSIS Hadoop components require a minimum or specific version of Cloudera Hadoop?
If so, would we risk breaking our ETL setup when we upgrade the Hadoop platform?
Microsoft's MSDN docs do not seem to mention any requirements on the Hadoop platform side for the SSIS Hadoop components to work.
Furthermore, our IT operations partner says that the WebHCat service should not be used to execute Pig or Hive scripts, and that it should only be used for querying Hadoop metadata. Can someone explain the inner workings of the SSIS Hadoop components and whether our operations partner has a point?
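To help narrow down whether the problem sits in SSIS or in the cluster, here is a minimal sketch of how the same kind of job could be submitted to WebHCat directly. It assumes the Pig task goes through WebHCat's REST endpoint (`POST /templeton/v1/pig` on the default port 50111), which is what our partner's remark implies but which I have not confirmed. The host name, user name, script, and status directory are placeholders, not values from our setup. The function only builds the request; sending it is left as a commented-out step.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_pig_request(host, user, pig_script, status_dir, port=50111):
    """Build (but do not send) a WebHCat Pig job submission request.

    Assumes the standard WebHCat (Templeton) endpoint and its documented
    'execute' and 'statusdir' form parameters.
    """
    # user.name identifies the submitting user when the cluster is not Kerberized.
    query = urlencode({"user.name": user})
    url = f"http://{host}:{port}/templeton/v1/pig?{query}"
    # 'execute' carries the inline Pig script; 'statusdir' is an HDFS
    # directory where WebHCat writes the job's status and output.
    body = urlencode({"execute": pig_script, "statusdir": status_dir})
    return Request(url, data=body.encode("utf-8"), method="POST")


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    req = build_pig_request(
        host="hadoop-edge.example.com",
        user="etl_user",
        pig_script="A = LOAD '/tmp/in.txt'; DUMP A;",
        status_dir="/tmp/pig.status",
    )
    print(req.get_method(), req.full_url)
    # To actually submit: urllib.request.urlopen(req).read()
```

If submitting this way returns the same "zookeeper-3.4.3.jar does not exist" message, that would suggest the error comes from the WebHCat installation on the cluster rather than from SSIS itself.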