
Uniffle Configuration

Common Settings

Common settings are used by both coordinators and shuffle servers.

| Property Name | Default | Description |
|---|---|---|
| rss.coordinator.quorum | - | Coordinator quorum list |
| rss.rpc.server.type | GRPC | |
| rss.rpc.server.port | - | RPC service port for the coordinator and the shuffle server |
| rss.rpc.message.max.size | 1073741824 | Max size of an RPC message |
| rss.rpc.metrics.enabled | true | Enable RPC-related metrics |
| rss.jetty.http.port | | Metrics service port |
| rss.jetty.corePool.size | 256 | Metrics service Jetty core pool size |
| rss.jetty.maxPool.size | 256 | Metrics service Jetty max pool size |
| rss.jetty.stop.timeout | 30000 | Jetty stop timeout (ms) |
| rss.jetty.http.idle.timeout | 30000 | Jetty HTTP idle timeout (ms) |
| rss.rpc.client.type | GRPC | Client RPC type |
| rss.storage.type | - | Supports MEMORY_LOCALFILE, MEMORY_HDFS, MEMORY_LOCALFILE_HDFS |
| rss.storage.data.replica | 1 | Number of storage replicas |
| rss.rpc.executor.size | 1000 | Number of threads gRPC uses to process requests |
| rss.jvm.metrics.verbose.enable | true | Enable verbose JVM metrics |

Coordinator

| Property Name | Default | Description |
|---|---|---|
| rss.coordinator.exclude.nodes.file.path | - | Path of the configuration file that lists the excluded nodes |
| rss.coordinator.exclude.nodes.check.interval.ms | 60000 | Update interval (ms) for excluded nodes |
| rss.coordinator.server.heartbeat.timeout | 30000 | Timeout (ms) when no heartbeat is received from a shuffle server |
| rss.coordinator.server.periodic.output.interval.times | 30 | Number of periods between outputs of the alive-node list. The output interval (ms) is rss.coordinator.server.heartbeat.timeout / 3 * rss.coordinator.server.periodic.output.interval.times, so the default output interval is 5 min. |
| rss.coordinator.assignment.strategy | PARTITION_BALANCE | Strategy for assigning shuffle servers; PARTITION_BALANCE should be used for workload balance |
| rss.coordinator.app.expired | 60000 | Application expiry time (ms); the heartbeat interval should be less than this value |
| rss.coordinator.shuffle.nodes.max | 9 | Max number of shuffle servers used in an assignment |
| rss.coordinator.access.checkers | org.apache.uniffle.coordinator.AccessClusterLoadChecker | Access checkers used when the Spark client uses DelegationRssShuffleManager; their results decide whether to use RSS |
| rss.coordinator.access.loadChecker.serverNum.threshold | - | Minimum number of healthy shuffle servers required when accessed by a client. When not specified, the shuffle-server number required by the client is used as the check condition. If the client does not specify one either, rss.coordinator.shuffle.nodes.max is used. |
| rss.coordinator.access.candidates.updateIntervalSec | 120 | Update interval (s) for access candidates; only valid when AccessCandidatesChecker is enabled |
| rss.coordinator.access.candidates.path | - | Path of the access candidates file; the file can be stored on HDFS |
| rss.coordinator.access.loadChecker.memory.percentage | 15.0 | Minimum percentage of available memory on a server |
| rss.coordinator.dynamicClientConf.enabled | false | Whether to enable dynamic client conf, which is fetched by the Spark client |
| rss.coordinator.dynamicClientConf.path | - | Path of the configuration file that holds the default conf for the RSS client (the dynamic client conf of this cluster); it can be stored on HDFS or locally |
| rss.coordinator.dynamicClientConf.updateIntervalSec | 120 | Update interval (s) for the dynamic client conf |
| rss.coordinator.remote.storage.cluster.conf | - | Remote storage cluster conf in the format $clusterId,$key=$value, with entries separated by ';' |
| rss.coordinator.remote.storage.select.strategy | APP_BALANCE | Strategy for selecting the remote path |
| rss.coordinator.remote.storage.schedule.time | 60000 | Scheduling interval for timing reads and writes on the paths of the different HDFS clusters |
| rss.coordinator.remote.storage.schedule.file.size | 204800000 | Size of the file that the scheduled thread reads and writes |
| rss.coordinator.remote.storage.schedule.access.times | 3 | Number of times the HDFS files are read and written |
| rss.coordinator.assignment.host.strategy | - | Strategy for selecting shuffle servers |
| rss.coordinator.startup-silent-period.enabled | false | Enable the startup silent period, during which assignment requests are rejected to avoid partial assignments. To avoid service interruption, this mechanism is disabled by default. It is especially recommended in coordinator HA mode when restarting a single coordinator. |
| rss.coordinator.startup-silent-period.duration | 20000 | Waiting duration (ms) when rss.coordinator.startup-silent-period.enabled is true |
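
Two of the coordinator settings above are easier to check with a small calculation. The following Python sketch is illustrative only and is not part of Uniffle: the function names are hypothetical, and the parser assumes that several `$key=$value` pairs for one cluster may follow the cluster ID separated by commas, which the format description above does not state explicitly.

```python
from typing import Dict


def alive_nodes_output_interval_ms(heartbeat_timeout_ms: int = 30000,
                                   interval_times: int = 30) -> int:
    """Apply the formula from the table:
    rss.coordinator.server.heartbeat.timeout / 3
    * rss.coordinator.server.periodic.output.interval.times
    """
    return heartbeat_timeout_ms // 3 * interval_times


def parse_remote_storage_cluster_conf(raw: str) -> Dict[str, Dict[str, str]]:
    """Parse '$clusterId,$key=$value' entries separated by ';'.

    Assumption: additional key=value pairs for the same cluster are
    comma-separated after the cluster ID.
    """
    clusters: Dict[str, Dict[str, str]] = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';'
        cluster_id, *pairs = [part.strip() for part in entry.split(",")]
        conf: Dict[str, str] = {}
        for pair in pairs:
            key, sep, value = pair.partition("=")
            if not sep:
                raise ValueError(f"expected key=value, got {pair!r}")
            conf[key.strip()] = value.strip()
        clusters[cluster_id] = conf
    return clusters


if __name__ == "__main__":
    # With the defaults, 30000 / 3 * 30 = 300000 ms, i.e. 5 minutes.
    print(alive_nodes_output_interval_ms())
    # Hypothetical cluster IDs and HDFS keys, for illustration only.
    print(parse_remote_storage_cluster_conf(
        "cluster1,dfs.nameservices=ns1;cluster2,dfs.nameservices=ns2"))
```

With the default values the first function returns 300000 ms, which matches the 5 min default stated in the table.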

Shuffle Server

| Property Name | Default | Description |
|---|---|---|
| rss.server.buffer.capacity | - | Max memory of the buffer manager for the shuffle server |
| rss.server.read.buffer.capacity | 10000 | Max size of the buffer for reading data |
| rss.server.heartbeat.delay | 10000 | Initial delay (ms) of the RSS heartbeat |
| rss.server.heartbeat.threadNum | 2 | Number of RSS heartbeat threads |
| rss.server.heartbeat.interval | 10000 | Heartbeat interval to the coordinator (ms) |
| rss.server.heartbeat.timeout | 10000 | Heartbeat timeout to the coordinator (ms) |
| rss.server.flush.threadPool.size | 10 | Size of the thread pool that flushes data to file |
| rss.server.flush.threadPool.queue.size | Integer.MAX_VALUE | Size of the waiting queue for the thread pool |
| rss.server.flush.thread.alive | 120 | Thread idle time in the pool (s) |
| rss.server.commit.timeout | 600000 | Timeout (ms) when committing shuffle data |
| rss.server.commit.check.interval.max.ms | 10000 | Max interval (ms) for checking commit status |
| rss.server.write.retry.max | 10 | Number of retries when a write fails |
| rss.server.app.expired.withoutHeartbeat | 60000 | Expiry time (ms) for an application that has no heartbeat with the coordinator |
| rss.server.memory.request.retry.max | 50 | Max number of retries for a memory request |
| rss.server.preAllocation.expired | 20000 | Expiry time (ms) for a pre-allocated buffer |
| rss.server.write.slow.threshold | 10000 | Threshold above which a write is considered slow |
| rss.server.event.size.threshold.l1 | 200000 | Threshold for event1 size |
| rss.server.event.size.threshold.l2 | 200000 | Threshold for event2 size |
| rss.server.event.size.threshold.l3 | 200000 | Threshold for event3 size |
| rss.server.cleanup.threshold | 10.0 | Threshold for disk cleanup |
| rss.server.pending.event.timeout.sec | 600 | If the disk cannot be written to for this many seconds, the flush data event fails |
| rss.server.shuffle.expired.timeout.ms | 120000 | Timeout (ms) for retaining shuffle data |
| rss.server.index.size.hint | 2097152 | Index file size hint |
| rss.server.health.max.storage.usage.percentage | 90.0 | When the usage percentage of a storage exceeds this value, the disk becomes unavailable |
| rss.server.health.storage.recovery.usage.percentage | 80.0 | When the usage percentage of an unavailable storage drops below this value, the disk becomes available again |
| rss.server.health.check.interval.ms | 5000 | Interval (ms) for the health check |
| rss.server.health.min.storage.percentage | 80 | Minimum percentage of storage that must pass the check for the node to be marked healthy |
| rss.server.health.check.enable | false | Switch for the health check |
| rss.server.health.checker.class.names | - | List of checker class names |
| rss.server.memory.shuffle.highWaterMark.percentage | 75.0 | Threshold for spilling data to storage, as a percentage of rss.server.buffer.capacity |
| rss.server.memory.shuffle.lowWaterMark.percentage | 25.0 | Threshold for keeping data in memory, as a percentage of rss.server.buffer.capacity |
| rss.server.flush.cold.storage.threshold.size | 64M | Data size threshold for choosing between LOCALFILE and HDFS when MEMORY_LOCALFILE_HDFS is used |
| rss.server.tags | - | Comma-separated list of tags that indicate the shuffle server's attributes. The coordinator uses them as the basis for assignment. |
| rss.server.single.buffer.flush.enabled | false | Whether to flush a single buffer when its size exceeds rss.server.single.buffer.flush.threshold |
| rss.server.single.buffer.flush.threshold | 64M | Threshold for flushing a single shuffle buffer |
| rss.server.disk.capacity | -1 | Disk capacity that the shuffle server can use. If negative, the whole space of the disk is used. |
| rss.server.multistorage.fallback.max.fail.times | 0 | For multistorage, switch storage when the number of failures exceeds this value |
| rss.server.localstorage.initialize.max.fail.number | 0 | For localstorage, the server exits when the number of local storages that fail to initialize exceeds this value |
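
The percentage-based settings above can be turned into concrete numbers. The sketch below is a rough illustration under stated assumptions, not Uniffle's implementation: the function names are hypothetical, the water marks are computed as plain percentages of `rss.server.buffer.capacity`, and the storage health rule assumes strict comparisons at the 90.0 and 80.0 boundaries.

```python
from typing import Tuple


def water_marks(buffer_capacity_bytes: int,
                high_pct: float = 75.0,
                low_pct: float = 25.0) -> Tuple[int, int]:
    """Return (high, low) water marks in bytes as percentages of
    rss.server.buffer.capacity."""
    high = int(buffer_capacity_bytes * high_pct / 100)
    low = int(buffer_capacity_bytes * low_pct / 100)
    return high, low


def next_storage_state(available: bool,
                       usage_pct: float,
                       max_usage_pct: float = 90.0,
                       recovery_pct: float = 80.0) -> bool:
    """Hysteresis between max.storage.usage.percentage and
    storage.recovery.usage.percentage.

    An available disk becomes unavailable above max_usage_pct; an
    unavailable disk recovers only after dropping below recovery_pct.
    """
    if available and usage_pct > max_usage_pct:
        return False
    if not available and usage_pct < recovery_pct:
        return True
    return available


if __name__ == "__main__":
    # Hypothetical 40 GiB buffer capacity.
    print(water_marks(40 * 1024 ** 3))
    # A disk at 85% usage keeps whatever state it already had.
    print(next_storage_state(True, 85.0), next_storage_state(False, 85.0))
```

The gap between the two health thresholds keeps a disk that hovers around 90% usage from flipping between available and unavailable on every check.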

Client

Common Setting

Common settings are used by both Spark and MapReduce (MR), but each client carries a different prefix: `spark.` is used by the Spark client and `mapreduce.` is used by the MR client.

| Property Name | Default | Description |
|---|---|---|
| rss.client.type | GRPC | RPC client type |
| rss.client.retry.max | 100 | Max RPC retry count |
| rss.client.retry.interval.max | 10000 | RPC retry interval (ms) |
| rss.coordinator.quorum | - | Coordinator quorum to access |
| rss.data.replica | 1 | Number of data replicas |
| rss.data.replica.write | 1 | Quorum write number |
| rss.data.replica.read | 1 | Quorum read number |
| rss.data.replica.skip.enabled | true | Enable data replica skipping |
| rss.client.data.transfer.pool.size | number of host processors (detected automatically) | Data transfer thread pool size |
| rss.client.data.commit.pool.size | -1 | Data commit pool size |
| rss.heartbeat.interval | 10000 | Client heartbeat interval |
| rss.heartbeat.timeout | - | Client heartbeat timeout |
| rss.storage.type | - | Storage type of shuffle data; supports MEMORY_LOCALFILE, MEMORY_HDFS, MEMORY_LOCALFILE_HDFS |
| rss.client.send.check.interval.ms | 500 | Interval (ms) for checking the send result |
| rss.client.send.check.timeout.ms | 600000 | Timeout (ms) for checking the send result |
| rss.writer.buffer.size | 3m | Buffer size for single partition data |
| rss.remote.storage.path | - | Remote storage path |
| rss.index.read.limit | 500 | Index number limit for each read |
| rss.client.read.buffer.size | 14m | Buffer size for each read |
| rss.client.assignment.tags | - | Client assignment tags |
| rss.client.assignment.retry.interval | - | Max client assignment retry times |
| rss.access.timeout.ms | 10000 | Cluster access timeout (ms) |
| rss.dynamicClientConf.enabled | true | Use dynamic conf from the coordinator |
| rss.client.assignment.shuffle.nodes.max | -1 | Max number of assigned shuffle server nodes |
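
As noted at the top of this section, the common client settings take a `spark.` prefix for the Spark client and a `mapreduce.` prefix for the MR client. The sketch below shows that mapping with a hypothetical helper named `with_prefix`; the coordinator addresses are placeholder values and not defaults from this page.

```python
from typing import Dict

# Prefix per client type, as described in this section.
PREFIXES = {"spark": "spark.", "mr": "mapreduce."}


def with_prefix(common: Dict[str, str], client: str) -> Dict[str, str]:
    """Return a copy of the common settings with the client prefix added."""
    prefix = PREFIXES[client]
    return {prefix + key: value for key, value in common.items()}


if __name__ == "__main__":
    common = {
        # Placeholder quorum value, for illustration only.
        "rss.coordinator.quorum": "coordinator1:19999,coordinator2:19999",
        "rss.storage.type": "MEMORY_LOCALFILE",
    }
    for key, value in with_prefix(common, "spark").items():
        print(f"{key}={value}")
    for key, value in with_prefix(common, "mr").items():
        print(f"{key}={value}")
```

Keeping one unprefixed dictionary and deriving both variants from it avoids the two clients drifting apart when a value changes.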

Spark Specialized Setting

| Property Name | Default | Description |
|---|---|---|
| spark.rss.writer.serializer.buffer.size | 3k | Serializer buffer size |
| spark.rss.writer.buffer.segment.size | 3k | Buffer size pre-allocated each time records are added |
| spark.rss.writer.buffer.spill.size | 128m | Buffer size for total partition data |
| spark.rss.writer.pre.allocated.buffer.size | 16m | Size allocated for each memory request to the Spark executor |
| spark.rss.writer.require.memory.retryMax | 1200 | Max number of memory requests |
| spark.rss.writer.require.memory.interval | 1000 | Memory request interval (ms) |
| spark.rss.client.heartBeat.threadNum | 4 | Number of Spark client heartbeat threads |
| spark.rss.client.send.size.limit | 16m | Max data size sent to the shuffle server |
| spark.rss.client.unregister.thread.pool.size | 10 | Max size of the thread pool used for unregistering |
| spark.rss.client.unregister.request.timeout.sec | 10 | Max timeout (s) when unregistering from remote shuffle servers |
| spark.rss.client.send.threadPool.size | 10 | Size of the client send-data thread pool for each executor |
| spark.rss.client.send.threadPool.keepalive | 60 | Maximum time that excess idle threads wait for new tasks before terminating |
| spark.rss.client.bitmap.splitNum | 1 | Bitmap split number |
| spark.rss.access.id | - | Auto access ID |
| spark.rss.enabled | false | Enable use of RssShuffleManager; used in auto access mode (DelegationRssShuffleManager) |
| spark.rss.client.access.retry.interval.ms | 20000 | Auto access retry interval (ms) |
| spark.rss.client.access.retry.times | 0 | Auto access retry count |
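
Several defaults above are written as size strings such as `3k`, `16m` and `128m`. The following sketch converts such strings to bytes so they can be compared with each other. It assumes binary units (1k = 1024 bytes) and case-insensitive suffixes, which this page does not state; `size_to_bytes` is a hypothetical helper and not a Uniffle or Spark API.

```python
import re

# Assumption: suffixes are binary multiples, as is common for JVM-style size strings.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def size_to_bytes(text: str) -> int:
    """Convert a size string like '3k', '16m' or '128m' to bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([kmg]?)b?\s*", text.lower())
    if match is None:
        raise ValueError(f"unrecognized size string: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


if __name__ == "__main__":
    for name, value in [
        ("spark.rss.writer.serializer.buffer.size", "3k"),
        ("spark.rss.client.send.size.limit", "16m"),
        ("spark.rss.writer.buffer.spill.size", "128m"),
    ]:
        print(name, size_to_bytes(value))
```

Under this assumption the default spill size (`128m`) is eight times the default send size limit (`16m`).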

MapReduce Specialized Setting

| Property Name | Default | Description |
|---|---|---|
| mapreduce.rss.client.send.threshold | 0.2 | Threshold for sending data |
| mapreduce.rss.client.sort.memory.use.threshold | - | Threshold of memory used for sorting in the client |
| mapreduce.rss.client.max.buffer.size | 3k | Max buffer size on the map side |
| mapreduce.rss.client.batch.trigger.num | 50 | Max batch of buffers for sending data on the map side |
| mapreduce.rss.client.memory.threshold | 0.8 | Threshold for sending data in the mapper |
| mapreduce.rss.reduce.remote.spill.enable | false | Whether to use remote spill (experimental, MEMORY_LOCALFILE_HDFS mode) |
| mapreduce.rss.reduce.remote.spill.attempt.inc | 1 | Number of extra reduce attempts, since HDFS fails more easily than local disk (experimental, MEMORY_LOCALFILE_HDFS mode) |
| mapreduce.rss.reduce.remote.spill.replication | 1 | Replication number for data spilled to HDFS (experimental, MEMORY_LOCALFILE_HDFS mode) |
| mapreduce.rss.reduce.remote.spill.retries | 5 | Number of retries for spilling data to HDFS (experimental, MEMORY_LOCALFILE_HDFS mode) |
| mapreduce.rss.remote.storage.conf | - | Remote storage cluster conf |

Security

| Property Name | Default | Description |
|---|---|---|
| rss.security.hadoop.kerberos.enable | false | Whether to enable access to a secured Hadoop cluster |
| rss.security.hadoop.kerberos.krb5-conf.file | - | File path of krb5.conf. Only valid when rss.security.hadoop.kerberos.enable is enabled. |
| rss.security.hadoop.kerberos.keytab.file | - | Kerberos keytab file path. Only valid when rss.security.hadoop.kerberos.enable is enabled. |
| rss.security.hadoop.kerberos.principal | - | Kerberos keytab principal. Only valid when rss.security.hadoop.kerberos.enable is enabled. |
| rss.security.hadoop.kerberos.relogin.interval.sec | 60 | Kerberos authentication relogin interval (s) |
| rss.jetty.ssl.enable | false | Enable SSL for the metrics service Jetty |
| rss.jetty.https.port | - | Metrics service Jetty SSL port |
| rss.jetty.ssl.keystore.path | - | Jetty SSL keystore path |
| rss.jetty.ssl.keymanager.password | - | Jetty SSL keymanager password |
| rss.jetty.ssl.keystore.password | - | Jetty SSL keystore password |
| rss.jetty.ssl.truststore.password | - | Jetty SSL truststore password |
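
The Kerberos options above only take effect when `rss.security.hadoop.kerberos.enable` is enabled. A pre-flight check along the following lines can catch an incomplete configuration before startup. It is a sketch under the assumption that all three dependent options must be set once Kerberos is enabled, which this page does not state outright; `missing_kerberos_options` is a hypothetical helper.

```python
from typing import Dict, List

KERBEROS_PREFIX = "rss.security.hadoop.kerberos."
# Options that this page describes as valid only when Kerberos is enabled.
DEPENDENT_OPTIONS = ("krb5-conf.file", "keytab.file", "principal")


def missing_kerberos_options(conf: Dict[str, str]) -> List[str]:
    """Return the dependent Kerberos options that are unset although
    rss.security.hadoop.kerberos.enable is true."""
    enabled = str(conf.get(KERBEROS_PREFIX + "enable", "false")).lower() == "true"
    if not enabled:
        return []  # dependent options are ignored when Kerberos is off
    return [KERBEROS_PREFIX + name
            for name in DEPENDENT_OPTIONS
            if not conf.get(KERBEROS_PREFIX + name)]


if __name__ == "__main__":
    # Placeholder path, for illustration only.
    conf = {
        "rss.security.hadoop.kerberos.enable": "true",
        "rss.security.hadoop.kerberos.keytab.file": "/path/to/rss.keytab",
    }
    for key in missing_kerberos_options(conf):
        print("missing:", key)
```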