バックログなし。

https://www.it-swarm-ja.com/ja/apache/ssl%E3%82%92%E4%BD%BF%E7%94%A8%E3%81%99%E3%82%8Bapachecer%E3%82%92crt%E8%A8%BC%E6%98%8E%E6%9B%B8%E3%81%AB%E5%A4%89%E6%8F%9B%E3%81%99%E3%82%8B%E6%96%B9%E6%B3%95/957865024/

■タイトル: .NET for Apache SparkでStreaming系のサンプルの一部が動作しない。

■製品名: .NET for Apache Spark

■質問内容:

現象: https://github.com/dotnet/spark/tree/main/examples/Microsoft.Spark.CSharp.Examples/Sql/Streaming

が一部動作しない。

発生環境:

Windows 10 VisualStudio? 2019 .NET Core App 3.1

PATH

C:\prog\dev\spark\spark-3.0.1-bin-hadoop2.7\;
C:\prog\dev\spark\spark-3.0.1-bin-hadoop2.7\bin;

HADOOP_HOME=C:\prog\dev\spark\spark-3.0.1-bin-hadoop2.7\ SPARK_HOME=C:\prog\dev\spark\spark-3.0.1-bin-hadoop2.7\

※ winutils.exeは、hadoop-2.7.1 のものを使用。

再現手順:

https://github.com/dotnet/spark/tree/main/examples/Microsoft.Spark.CSharp.Examples/Sql/Streaming

をビルドして実行する。

StructuredNetworkCharacterCount?.csを Program.csに切り出して実行してもイイ。

spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner?

  • master local microsoft-spark-3-0_2.12-2.0.0.jar Microsoft.Spark.CSharp.Examples.exe Sql.Streaming.StructuredNetworkCharacterCount? localhost 9999

→ 上手く動作する。

spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner?

  • master local microsoft-spark-3-0_2.12-2.0.0.jar Microsoft.Spark.CSharp.Examples.exe Sql.Streaming.StructuredNetworkWordCount? localhost 9999

→ 無限ループのような動作(ログは別途添付

spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner?

  • master local microsoft-spark-3-0_2.12-2.0.0.jar Microsoft.Spark.CSharp.Examples.exe Sql.Streaming.StructuredNetworkWordCountWindowed? localhost 9999 10 10

→ 同上

質問事項:

構造化ストリーミングのスライディングイベントタイムウィンドウにわたる集約処理を、上手く動作させる方法、トラブルシュートの方法を教えてほしい。

背景: -

調査状況: 以下については実行できている。

.NET for Apache Spark の概要 | Microsoft Docs (チュートリアル: .NET for Apache Spark の概要) https://docs.microsoft.com/ja-jp/dotnet/spark/tutorials/get-started

DataFrame? words = lines

                .Select(Explode(Split(lines["value"], " "))
                    .Alias("word"));
            DataFrame wordCounts = words.GroupBy("word").Count();
            Spark.Sql.Streaming.StreamingQuery query = wordCounts
                .WriteStream()
                .OutputMode("complete")
                .Format("console")
                .Start();
            DataFrame words = lines
                .Select(Explode(Split(lines["value"], " "))
                    .Alias("word"));
            //DataFrame wordCounts = words.GroupBy("word").Count();
            Spark.Sql.Streaming.StreamingQuery query = words
                .WriteStream()
                //.OutputMode("complete")
                .Format("console")
                .Start();

21/07/30 10:10:25 INFO ShuffleBlockFetcherIterator?: Getting 0 (0.0 B) non-empty blocks including 0 (0.0 B) local and 0 (0.0 B) host-local and 0 (0.0 B) remote blocks 21/07/30 10:10:25 INFO ShuffleBlockFetcherIterator?: Started 0 remote fetches in 0 ms 21/07/30 10:10:25 INFO CheckpointFileManager?: Writing atomically to file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20/1.delta using temp file file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20/.1.delta.7ee17d60-cb84-45be-bdaa-dec70650391a.TID21.tmp 21/07/30 10:10:25 INFO CheckpointFileManager?: Renamed temp file file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20/.1.delta.7ee17d60-cb84-45be-bdaa-dec70650391a.TID21.tmp to file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20/1.delta 21/07/30 10:10:25 INFO HDFSBackedStateStoreProvider?: Committed version 1 for HDFSStateStore?[id=(op=0,part=20),dir=file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20] to file file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20/1.delta 21/07/30 10:10:25 INFO DataWritingSparkTask?: Commit authorized for partition 20 (task 21, attempt 0, stage 1.0) 21/07/30 10:10:25 INFO DataWritingSparkTask?: Committed partition 20 (task 21, attempt 0, stage 1.0) 21/07/30 10:10:25 INFO CheckpointFileManager?: Writing atomically to file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20/1.delta using temp file file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20/.1.delta.bb85ef3a-a36f-4ce4-97d8-5e7a1ce9e278.TID21.tmp 21/07/30 10:10:25 INFO HDFSBackedStateStoreProvider?: Aborted version 1 for HDFSStateStore?[id=(op=0,part=20),dir=file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/20] 21/07/30 10:10:25 INFO Executor: Finished task 20.0 in stage 1.0 (TID 21). 5186 bytes result sent to driver 21/07/30 10:10:25 INFO TaskSetManager?: Starting task 21.0 in stage 1.0 (TID 22, nishi.mshome.net, executor driver, partition 21, PROCESS_LOCAL, 7325 bytes) 21/07/30 10:10:25 INFO Executor: Running task 21.0 in stage 1.0 (TID 22) 21/07/30 10:10:25 INFO TaskSetManager?: Finished task 20.0 in stage 1.0 (TID 21) in 233 ms on nishi.mshome.net (executor driver) (21/200) 21/07/30 10:10:25 INFO StateStore?: Retrieved reference to StateStoreCoordinator?: org.apache.spark.sql.execution.streaming.state.StateStoreCoordinatorRef?@4a0e091 21/07/30 10:10:25 INFO StateStore?: Reported that the loaded instance StateStoreProviderId?(StateStoreId?(file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state,0,21,default),5db0ae43-3bed-4178-a4a0-e67a50ce34a5) is active 21/07/30 10:10:25 INFO HDFSBackedStateStoreProvider?: Retrieved version 0 of HDFSStateStoreProvider?[id = (op=0,part=21),dir = file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/21] for update 21/07/30 10:10:25 INFO StateStore?: Retrieved reference to StateStoreCoordinator?: org.apache.spark.sql.execution.streaming.state.StateStoreCoordinatorRef?@7ac00194 21/07/30 10:10:25 INFO StateStore?: Reported that the loaded instance StateStoreProviderId?(StateStoreId?(file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state,0,21,default),5db0ae43-3bed-4178-a4a0-e67a50ce34a5) is active 21/07/30 10:10:25 INFO HDFSBackedStateStoreProvider?: Retrieved version 0 of HDFSStateStoreProvider?[id = (op=0,part=21),dir = file:/C:/Users/nishi/AppData?/Local/Temp/temporary-df750248-d60c-4c7d-a6ea-72e0a2e82006/state/0/21] for update


トップ   編集 凍結 差分 バックアップ 添付 複製 名前変更 リロード   新規 一覧 単語検索 最終更新   ヘルプ   最終更新のRSS
Last-modified: 2021-08-02 (月) 11:11:24 (6h)