
#MAGIC LAUNCHER IOEXCEPTION UPDATE#
The following release notes include information for Amazon EMR release version 6.5.0.

Component versions:
- Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282)
- Java JDK version Corretto-8.302.08.1 (build 1.8.0_302-b08)
- Connectors and drivers: DynamoDB Connector 4.16.0, Amazon EMR Kinesis Connector 3.5.0, AWS Glue Hive Metastore Client 3.3.0
- Apache Ranger KMS (multi-master transparent encryption) version 2.0.0

Changes:
- Spark shuffle data managed scaling optimization - For Amazon EMR versions 5.34.0 and later, and EMR versions 6.4.0 and later, managed scaling is now Spark shuffle data aware (shuffle data is data that Spark redistributes across partitions to perform specific operations). For more information on shuffle operations, see Using EMR managed scaling in Amazon EMR in the Amazon EMR Management Guide, and the Spark Programming Guide.
- On Apache Ranger-enabled Amazon EMR clusters, you can use Apache Spark SQL to insert data into or update the Apache Hive metastore tables using INSERT INTO, INSERT OVERWRITE, and ALTER TABLE (a sketch follows below).
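As a rough illustration, here is a minimal Spark SQL sketch in Java; the sales_db database, the orders and orders_staging tables, and the inserted values are hypothetical placeholders, not part of the release notes:

    import org.apache.spark.sql.SparkSession;

    public class RangerHiveInsert {
        public static void main(String[] args) {
            // enableHiveSupport() makes Spark SQL resolve tables through the Hive metastore.
            SparkSession spark = SparkSession.builder()
                    .appName("ranger-hive-insert")
                    .enableHiveSupport()
                    .getOrCreate();

            // INSERT INTO appends rows to an existing metastore table.
            spark.sql("INSERT INTO sales_db.orders VALUES (1001, 'widget', 2)");

            // INSERT OVERWRITE replaces the contents of the target table.
            spark.sql("INSERT OVERWRITE TABLE sales_db.orders_staging SELECT * FROM sales_db.orders");

            // ALTER TABLE updates the table definition in the metastore.
            spark.sql("ALTER TABLE sales_db.orders ADD COLUMNS (note STRING)");

            spark.stop();
        }
    }

On a Ranger-enabled cluster, each of these statements is subject to the cluster's Ranger authorization policies for the Hive metastore.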

Amazon EMR 5.34.0 is a release to fix issues with Amazon EMR Scaling when it fails to scale a cluster up or down successfully or causes application failures.

Changes, Enhancements, and Resolved Issues:
- Spark performance improvement - heterogeneous executors are disabled when certain Spark configuration values are overridden in EMR 5.34.0.
- Improved the EMR on-cluster daemons to correctly track node states when IP addresses are reused, which improves reliability during scaling operations.
- Fixed an issue where job failures occurred during cluster scale-down because Spark was assuming all available nodes were deny-listed.
- Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when a cluster tried to scale up or down.
- Fixed an issue with step or job failures during cluster scaling by ensuring that node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
- Fixed an issue where cluster operations such as scale-down and step submission failed for Amazon EMR clusters with Kerberos authentication enabled. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to communicate securely with HDFS/YARN running on the master node.

Known issues in Amazon EMR 6.4.0:
- WebHDFS and the HttpFS server are disabled by default. You can re-enable WebHDFS using the Hadoop configuration property dfs.webhdfs.enabled; the HttpFS server can be started with sudo systemctl start hadoop-httpfs.
- Hue queries do not work because the Apache Hadoop HttpFS server is disabled by default. To use Hue on Amazon EMR 6.4.0, either manually start the HttpFS server on the Amazon EMR master node using sudo systemctl start hadoop-httpfs, or use an Amazon EMR step.
- The Amazon EMR Notebooks feature used with Livy user impersonation does not work because HttpFS is disabled by default; the EMR notebook cannot connect to a cluster that has Livy impersonation enabled. The workaround is to start the HttpFS server before connecting the EMR notebook to the cluster, again using sudo systemctl start hadoop-httpfs (a probe sketch follows below).
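As a quick way to verify the workaround, here is a minimal probe sketch in Java. It assumes the default HttpFS port 14000 on the master node and the WebHDFS-compatible REST path; the hadoop user name is a placeholder:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class HttpFsProbe {
        public static void main(String[] args) throws Exception {
            // Assumption: HttpFS listens on port 14000 and serves the
            // WebHDFS-compatible REST API under /webhdfs/v1/.
            String host = args.length > 0 ? args[0] : "localhost";
            URI uri = URI.create("http://" + host
                    + ":14000/webhdfs/v1/?op=LISTSTATUS&user.name=hadoop");

            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());

            // HTTP 200 means HttpFS is up, so sudo systemctl start hadoop-httpfs
            // has taken effect; a connection error means it is still down.
            System.out.println("HTTP " + response.statusCode());
        }
    }

If the probe fails with a connection error, HttpFS has not been started, and Hue queries or Livy-backed notebooks will not be able to connect.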

Additional resolved issues:
- Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health-checking activities, such as gathering YARN node state and HDFS node state. This was happening because the on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
- Previously, manual restart of the resource manager on a multi-master cluster caused Amazon EMR on-cluster daemons, like Zookeeper, to reload all previously decommissioned or lost nodes in the Zookeeper znode file. This caused default limits to be exceeded in certain situations. Amazon EMR now removes decommissioned or lost node records older than one hour from the Zookeeper file, and the internal limits have been increased (a znode inspection sketch follows below).
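The znode layout that the EMR daemons use is internal and not documented in the release notes, but as a rough illustration, this sketch lists the children of a znode with the plain ZooKeeper Java client; the localhost:2181 connection string and the path argument are placeholders:

    import java.util.List;
    import org.apache.zookeeper.ZooKeeper;

    public class ZnodeLister {
        public static void main(String[] args) throws Exception {
            // Connect to the ZooKeeper ensemble; the address is an assumption.
            ZooKeeper zk = new ZooKeeper("localhost:2181", 10_000, event -> { });
            try {
                // Placeholder path; the znode EMR uses for node records is internal.
                String path = args.length > 0 ? args[0] : "/";
                List<String> children = zk.getChildren(path, false);
                System.out.println(path + " has " + children.size() + " child znodes");
                for (String child : children) {
                    System.out.println("  " + child);
                }
            } finally {
                zk.close();
            }
        }
    }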

#MAGIC LAUNCHER IOEXCEPTION MANUAL#
If you recall the old days just before Java SE 11 (JDK 11): say you have a HelloUniverse.java source file. You first had to compile it with javac, and only then could you run the resulting class with the java launcher. Since JDK 11 (JEP 330), the launcher can run a single source file directly.
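A minimal HelloUniverse.java showing both workflows:

    // HelloUniverse.java
    //
    // Before JDK 11: compile first, then run the compiled class:
    //   javac HelloUniverse.java
    //   java HelloUniverse
    //
    // Since JDK 11 (JEP 330), the launcher runs the source file directly:
    //   java HelloUniverse.java
    public class HelloUniverse {
        public static void main(String[] args) {
            System.out.println("Hello, Universe!");
        }
    }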
